Re: poll(): IN/OUT vs {RD,WR}NORM
On Tue, May 28, 2024 at 02:33:48 +, David Holland wrote: > anything other than the same set of vague descriptions we had in the > older poll(2). poll(2) is ... ok, I'm not even sure what adjective to use here. I had to write some async TCP poll code that needed to work on Linux, Solaris and MacOS, and I also tested it on NetBSD - the behavior was fairly noticeably different (half-close was half the fun). Yet all behaviors conformed to what the vague descriptions in (various) poll(2) manpages said. -uwe
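A minimal sketch of the kind of check involved, not the code from the message above. It polls a single descriptor with a zero timeout and reports whether POLLIN came back. The helper name readable_now is made up for illustration. Only the plain "data is waiting" case is shown, since that is about the only part that is uniform; what a half-closed TCP socket reports (POLLIN, POLLHUP, POLLRDNORM, or some mix) is exactly what differed between systems.

```c
#include <assert.h>
#include <poll.h>
#include <unistd.h>

/*
 * Hypothetical helper: non-blocking check whether fd reports POLLIN.
 * Returns 1 if readable, 0 if not, -1 if poll() itself failed.
 */
int
readable_now(int fd)
{
	struct pollfd pfd;
	int n;

	assert(fd >= 0);

	pfd.fd = fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	n = poll(&pfd, 1, 0);	/* timeout 0: return immediately */
	if (n == -1)
		return -1;
	return (n == 1 && (pfd.revents & POLLIN)) ? 1 : 0;
}
```

On a pipe this says "no" while the pipe is empty and "yes" once a byte has been written to the other end. Anything beyond that (HUP, ERR, the NORM/BAND variants) would need per-system testing.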
Re: [RFC] new APIs to use wskbd(4) input on non-wsdisplay tty devices
On Mon, Apr 08, 2024 at 23:27:11 +0900, Izumi Tsutsui wrote: > macallan@ wrote: [...] > > Oh, so it's an entire terminal emulation, not just something that lets > > you draw characters? > > Ah, maybe I see misunderstandings among us. > > In sgimips crmfb and newport cases, a putchar() function provided > by the firmware just draws a character (glyph) at the cursor > or specified position. All virtual terminal emulation ops are > done in wsdisplay(9) and vcons(9) layer and MD drivers just draw > characters (or whitespace) per vcons pseudo text VRAM attributes. > > On the other hand, news68k (and sun) machines have putchar() > that also handles virtual terminal ops like backspace, CR/LF, > and even scrolling at the bottom of screen. In this case > no VT emulation layer is necessary in the kernel side, > so kernel's putc(9) just calls firmware's putchar(), > and for userland processes we can simply pass translated > wskbd inputs to line discipline of the tty device. > > That's the reason why I proposed to add register/deregister > APIs to pass wskbd data to romcons tty device. > > What do you think about this case? Add trivial wsemul_none (or wsemul_delegate, or whatever a suitable name might be) that does even less than wsemul_dumb and only ever uses putchar to pass chars to the firmware emulator? -uwe
Re: [RFC] new APIs to use wskbd(4) input on non-wsdisplay tty devices
On Sat, Apr 06, 2024 at 23:56:27 +0900, Izumi Tsutsui wrote: > To support "text only" framebuffer console, we can use putchar > functions provided by the firmware PROM. Is that a console-typewriter--like device without addressable cursor terminal emulation? Can you use wsemul_dumb to avoid rasops? It still uses copy/erase, but with trivial argument values that can be reversed back to putchar calls for tab/lf (from a very quick look). > The attached patch provides new two APIs > - wskbd_consdev_kbdinput_register() > - wskbd_consdev_kbdinput_deregister() > to allow a kernel to use wskbd(9) for non-wsdisplay tty device. AFAIU, there's nothing console-specific in this (except that its first use is going to be for a console), so maybe it would be better to drop the "consdev" from the name? > Index: sys/dev/wscons/wskbd.c > === > RCS file: /cvsroot/src/sys/dev/wscons/wskbd.c,v > retrieving revision 1.143 > diff -u -p -d -r1.143 wskbd.c > --- sys/dev/wscons/wskbd.c 5 Feb 2019 10:04:49 - 1.143 > +++ sys/dev/wscons/wskbd.c 6 Apr 2024 06:59:50 - [...] > @@ -706,6 +709,24 @@ wskbd_input(device_t dev, u_int type, in > } > #endif > > +#if NWSDISPLAY == 0 > + if (sc->sc_translating) { The #endif above is for NWSDISPLAY > 0, so maybe get rid of the ifdefs and use plain ifs? Thanks. -uwe
Re: Change max ttys from 8 to 12?
On Tue, Dec 19, 2023 at 11:10:52 +0100, Dan-Simon Myrland wrote: > 2) Make a custom kernel with the option WSDISPLAY_DEFAULTSCREENS=12 Why? WSDISPLAY_DEFAULTSCREENS is the number of screens pre-created by the kernel, but you can always create as many as you need (subject to WSDISPLAY_MAXSCREEN), see /etc/wscons.conf and /etc/rc.d/wscons. The default for that default is actually 0. > I don't mind that NetBSD has four active ttys by default, but the steps > to enable all 12 seems unnecessarily tedious. I realize that not all > architectures have 12 function keys, but laptops usually do. If the > kernel supported a maximum of 12 ttys on popular architectures, > enabling then would only require steps 1 and 2. But where do you stop? macs have like, what, 19? I don't like the idea. The only real limitation currently is the switching commands/keysyms. WSDISPLAY_MAXSCREEN is a static limit b/c it was easier to hardcode it, but it may be made an option easily as far as I can tell. Switching from a fixed size array to a dynamic one is probably not too much work either. But then, overall, I think that trying to make the kernel substitute for screen, tmux (in base), etc is kinda dead end, so I'd rather we don't encourage it. -uwe
Re: how do I preset ddb's LINES to zero
On Fri, Dec 15, 2023 at 11:19:39 -0500, Andrew Cagney wrote: > I've the stock 10.0 boot.iso booting within a KVM based test > framework. I'd like to set things up so that should there be a panic, > it dumps registers et.al., without stopping half way waiting for > someone to hit the space bar vis: > > r9 af80b451d080 > r10 81d9a063 > r11 0 > r12 0 > --db_more-- > > Is there a way to stop this without having to rebuild the kernel. If the kernel is already there, you can't avoid that prompt without *some* interaction. I don't think you can tweak this from boot.cfg There's probably no good default for db_more prompt, as there are situations where someone wants it on and someone off. May be we should make that into a boot argument, so that if a script talks to the console, it can issue the corresponding boot command at a well defined time instead of doing expect-like things? Or may be force the paging off for db_cmd_on_enter. PS: xen console seems to forcibly override db_max_line to avoid paging prompt. -uwe
Re: [RFC] userconf(4) modification
On Thu, Nov 02, 2023 at 16:29:42 +0100, tlaro...@kergis.com wrote: > You will find attached the man page in order to be able to comment > about the proposed new syntax---supplementary syntax: it does not > replace the "legacy" one. The man page is super-confusing. Someone who needs to use userconf to get their system to boot needs a clear reference, but the proposed version tries to be overly formal and ends up a bit opaque. I also don't understand why it is necessary to call the old syntax "legacy". From the man page my impression is that the command can be either "command dev" or "command property = value"; both are in a sense a kind of device selector, so why do you have to declare one of them "legacy"? The user probably doesn't care much either way, they need to get the kernel booting and are not interested in the lore. Why is the thing after = called an "expression"? That position only accepts two kinds of literals, one of which is a shorthand for the other (but I had to re-read that paragraph several times and I'm still not quite sure it actually clearly says that). -uwe
Re: Testing Emulation Syscalls
On Tue, Aug 01, 2023 at 12:39:46 +0200, Martin Husemann wrote: > On Tue, Aug 01, 2023 at 01:34:54PM +0300, Valery Ushakov wrote: > > As for testing emulated syscalls - can we solve this problem with a > > bit of elf branding to convince the kernel to start the binary under > > emulation directly? Inventing a whole new backdoor API for that seems > > kinda an overkill. > > That is probably quite easy to do, but we have a toolchain problem then > (solvable too). > > We need build.sh to be able to produce the test binaries (including > any needed libs, which don't have to be "native" libs of the emulated > system). For simple cases - can we get away with tiny'ish freestanding test programs that invoke the tested syscalls so that we don't have to pull a new cross-compilation setup out of thin air just for that? Testing something like lwp-related calls probably requires a real linux libpthread anyway, not a gum-and-toothpicks effigy. -uwe
Re: Testing Emulation Syscalls
On Tue, Aug 01, 2023 at 08:16:50 +0200, Martin Husemann wrote: > On Mon, Jul 31, 2023 at 05:03:48PM -0400, Theodore Preduta wrote: > > One idea (mentioned in the original thread) would be to introduce a > > syscall along the lines of > > > > int emul_syscall(const char *emul_name, int number, ...) > > > > which executes a single syscall. The flaw with this idea is that state > > may need to be stored across syscalls in struct linux_emuldata, but I > > don't know how this interface could accommodate this. > > > > Another idea would be to introduce a syscall along the lines of > > > > int setemul(const char *emul_name) > > > > which would switch the syscall table dynamically so that the test case > > could be run under emulation (preserving emuldata state) and then switch > > back to report the result. (And then individual syscalls would be > > called via __syscall(2).) > > I think this would be quite tricky for the test code in userland. > > But what about a variant of the initial suggestion: > > // returns an integer descriptor > int open_emul(const char *emulname); > > // invokes a syscall under an open emulation > int emul_syscall(int emul, int number, ...); > > // frees all state for the emulation, returns 0 or -1 > int close_emul(int emul); > > IMO this still is far better than exposing native syscalls that we do > not really need/want. I'd rather we don't expose epoll(2) as a native syscall at least not just yet, given what pkgsrc folks are telling us. It's not like we have pressing need to use it in our own code in base. And for third party software it creates confusion as was pointed out. As for testing emulated syscalls - can we solve this problem with a bit of elf branding to convince the kernel to start the binary under emulation directly? Inventing a whole new backdoor API for that seems kinda an overkill. -uwe
Re: [PATCH] style(5): No struct typedefs
On Tue, Jul 11, 2023 at 05:56:27 -0700, Jason Thorpe wrote: > > On Jul 11, 2023, at 3:17 AM, Taylor R Campbell wrote: > > > > If we used `struct bus_dma_tag *' instead, the forward declaration > > could be `struct bus_dma_tag;' instead of having to pull in all of > > sys/bus.h, _and_ the C compiler would actually check types. > > In the original design, it's not always a struct. That was the > whole point of using a more abstract type. The bus_dma_tag_t example from the original email is not the best one, but I didn't want to open that can of worms in my reply, so I mentioned the "not always struct" case without actually mentioning names. The style(5) specifically gives an example of a struct typedef, not of an opaque typedef. > If you want to hide the struct'ness in a machdep header file, fine, > but I completely disagree with the notion of requiring the use of > the "struct" keyword all over the place. I used to lean both ways at different times and in different contexts. I think that existence and usefulness of opaque typedefs is exactly the strong argument against using "convenience struct typedefs", b/c the latter dilute the message so to speak. If someone wants to program with "systems hungarian", they know where to find it... -uwe
Re: [PATCH] style(5): No struct typedefs
On Tue, Jul 11, 2023 at 10:17:24 +, Taylor R Campbell wrote: > I propose the attached change to KNF style(5) to advise against > typedefs for structs or unions, or pointers to them. [...] > (Typedefs for integer, function, and function pointer types are not > covered by this advice.) Yes, please. Typedefs make sense when the type is *really* opaque and can, behind the scenes, be an integer type, a pointer or a struct. [Ab]using typedefs to save 8 bytes of "struct " + "*" just adds cognitive load (and whatever logistical complications that you have enumerated in the elided part of the quote). -uwe
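A small sketch of the forward-declaration point, with made-up names: struct frob_tag and the frob_* functions are hypothetical, not anything in the tree. It only shows that a bare struct declaration is enough for consumers to pass pointers around, while the members stay private to the implementation.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * What a consumer header needs: a forward declaration is enough to
 * declare functions that take or return pointers to the struct.
 */
struct frob_tag;		/* hypothetical name */

struct frob_tag *frob_create(int);
int frob_value(const struct frob_tag *);
void frob_destroy(struct frob_tag *);

/* What only the implementation needs to see. */
struct frob_tag {
	int ft_value;
};

struct frob_tag *
frob_create(int value)
{
	struct frob_tag *t = malloc(sizeof(*t));

	if (t != NULL)
		t->ft_value = value;
	return t;
}

int
frob_value(const struct frob_tag *t)
{
	assert(t != NULL);
	return t->ft_value;
}

void
frob_destroy(struct frob_tag *t)
{
	free(t);
}
```

Since the parameter type is spelled struct frob_tag *, handing frob_value() a pointer to some other struct gets a diagnostic from the compiler, and no typedef has to be dragged in from another header to get that.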
Re: PROPOSAL: Split uiomove into uiopeek, uioskip
On Tue, May 09, 2023 at 14:33:26 -0700, Jason Thorpe wrote: > I'm not a fan of uioskip() as a name - I think uioadvance() would be > better. Skip implies, to my brain, that the data is being thrown > away (even if you're already consumed it). I agree. "skip" seems to have the wrong connotations (cf. dd(1)). -uwe
Re: building 9.1 kernel with /usr/src elsewhere?
On Wed, Mar 08, 2023 at 15:22:11 +1100, matthew green wrote: > > This completed apparently normally, reporting the build directory and > > telling me to remember to make depend. I then went to ~/kbuild/GEN91 > > and ran make depend && make. It failed fast - no more than a second or > > two - with > > > > make[1]: don't know how to make absvdi2.c. Stop > > what happens if you run "make USETOOLS=no"? That's orthogonal. The problem is that NETBSDSRCDIR cannot be inferred for a randomly located kernel builddir and sys/lib/libkern/Makefile.compiler-rt uses it. I don't know enough about sys Makefiles, but may be using $S instead of ${NETBSDSRCDIR}/sys will just fix it. But may be will also break something else. Our makefile spaghetti is a bit out of control. -uwe
Re: The list of __HAVE macros
On Sun, Mar 05, 2023 at 18:48:08 +, Taylor R Campbell wrote: > > I think it might be a good idea to document the list in one place with > > xrefs to the relevant section 9 pages (and to document the options > > there too, e.g. __HAVE_SIMPLE_MUTEXES in mutex(9)). > > I agree! Want to draft a skeleton in share/man/man9, say > portfeatures.9 or something, so we can fill them and xref as needed? Done. If people can document stuff there and in relevant section 9 pages that would be cool. Don't get too concerned with the finer points of mdoc markup, I'll try to clean that up if necessary. -uwe
The list of __HAVE macros
We don't seem to have(9) a man page that lists all __HAVE_* macros that a port may provide. E.g. $ apropos -M '"__HAVE_PREEMPTION"' cpu_need_resched (9) context switch notification but $ apropos -M '"__HAVE_SIMPLE_MUTEXES"' apropos: No relevant results obtained. Please make sure that you spelled all the terms correctly or try using different keywords. I think it might be a good idea to document the list in one place with xrefs to the relevant section 9 pages (and to document the options there too, e.g. __HAVE_SIMPLE_MUTEXES in mutex(9)). -uwe
Re: erlang -> asmjit -> mremap questions/bugs
On Wed, Mar 01, 2023 at 15:29:27 +0100, Thomas Klausner wrote: > It seems the problem is that mmap() in the mremap(2) man page example > (which was used to implement the asmjit version) is not using > MAP_SHARED. > > - I'd like to add MAP_SHARED in the mmap() call in the mremap(2) man > page example. Is that fine? Not really. There are no other processes involved in the example, so MAP_SHARED makes no sense. > - Reading mmap(2) it seems that one of MAP_SHARED or MAP_PRIVATE is > required, but there is no error if none is provided. Should we change > mmap() to return an error in that case? sys/kern/vfs_vnops.c: /* * Old programs may not select a specific sharing type, so * default to an appropriate one. */ > - Why does MAP_PRIVATE (instead of MAP_SHARED) not work? Are there multiple processes involved in the erlang case? -uwe
Re: Finding the slot in the ioconf table a module attaches to?
On Wed, Feb 01, 2023 at 11:14:42 -0800, Brian Buhrow wrote: > hello. Okay. That is helpful. Passing -1 in as the cmajor > number to the devsw_attach() function does, in fact, assign a > reasonable major number which seems to work. I use the > cdevsw_lookup_major() function to retrieve the assigned number and > print it for the user. devsw_attach updates with the assigned number if you passed NODEVMAJOR (-1) in it, so you don't even need to look it up separately. We also have in-kernel convenience "MAKEDEV". E.g., paraphrasing a bit, vbox guest additions module does: bmajor = cmajor = NODEVMAJOR; error = devsw_attach("vboxguest", NULL, , _cdevsw, ); if (error) ... error = do_sys_mknod(curlwp, "/dev/vboxguest", S_IFCHR | 0666, makedev(cmajor, 0), , UIO_SYSSPACE); if (error == EEXIST) { error = 0; /* * Since NetBSD doesn't yet have a major reserved for * vboxguest, the (first free) major we get will * change when new devices are added, so an existing * /dev/vboxguest may now point to some other device, * creating confusion (tripped me up a few times). */ aprint_normal("vboxguest: major %d:" " check existing /dev/vboxguest\n", cmajor); } (The comment is no longer true, as we do have a reserved major for vbox now). -uwe
Re: Finding the slot in the ioconf table a module attaches to?
On Wed, Feb 01, 2023 at 08:27:57 -0500, Brad Spencer wrote: > To add a bit... generally I have just added an entry to one of the > "major" files in sys/conf. However, I have noticed that in order for > the module to be able to use it, after the major file edit, I had to > rebuild the kernel as well. I have never been 100% sure that was proper > behavior, but it seems to be the case. That is, just editing the major > file and building or rebuilding the module has not been enough. .Xr devsw_attach 9 Major numbers (mapping "foo" -> 42, so that a program that opens a node with major 42 gets to the device "foo") are a property of the kernel config, so yes, you need to rebuild the kernel when you introduce a new fixed major number. When the module is loaded, the driver tells the kernel, "I'm `foo'". It can also tell the kernel either "I'm ok with whatever major number you give me", or it can tell "I want a specific major number N". It's an error to request a specific major N that is already taken (either fixed in a major config file, or dynamically allocated to another driver). -uwe
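The two kinds of request described above can be modelled in a few lines of userland C. This is a toy, not the kernel's devsw_attach(9): the table, the toy_* names and the errno values returned are all assumptions made for the sketch. It shows only the rule that "any major" gets the first free slot and that asking for a major that is already taken is an error.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define TOY_NODEVMAJOR	(-1)	/* "any major will do" */
#define TOY_NMAJORS	8	/* arbitrary table size for the sketch */

/* toy_names[n] is the driver name that owns major n, or NULL if free. */
static const char *toy_names[TOY_NMAJORS];

/*
 * Toy stand-in for the attach step.  On success returns 0 and *major
 * holds the major number the driver ended up with.
 */
int
toy_attach(const char *name, int *major)
{
	int n;

	assert(name != NULL && major != NULL);

	if (*major == TOY_NODEVMAJOR) {
		/* "Whatever you give me": hand out the first free slot. */
		for (n = 0; n < TOY_NMAJORS; n++) {
			if (toy_names[n] == NULL) {
				toy_names[n] = name;
				*major = n;
				return 0;
			}
		}
		return ENFILE;	/* table full; errno picked arbitrarily */
	}

	/* "I want major N": only succeeds if N is valid and still free. */
	if (*major < 0 || *major >= TOY_NMAJORS)
		return EINVAL;
	if (toy_names[*major] != NULL)
		return EEXIST;
	toy_names[*major] = name;
	return 0;
}
```

In this model a driver that passes TOY_NODEVMAJOR reads its number back from *major after the call, and a second driver asking for a major that is already owned is refused.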
Re: Finding the slot in the ioconf table a module attaches to?
On Wed, Feb 01, 2023 at 08:28:35 +, RVP wrote: > /usr/src/sys/modules/examples/readhappy/readhappy.c > /usr/src/sys/conf/majors* Hmm, lots of real modules seem to use config_init_component(), which is not documented at all in section 9. Can someone please write a man page for that? I'll help with mdoc if troff incantations make you anxious :) -uwe
Re: Add five new escape sequences to wscons
On Mon, Jan 16, 2023 at 15:10:06 -0300, Crystal Kolipe wrote: > On Mon, Jan 16, 2023 at 08:20:35PM +0300, Valery Ushakov wrote: > > On Mon, Jan 16, 2023 at 09:18:53 -0300, Crystal Kolipe wrote: > > > > > It's useful, because these sequences correspond to the terminfo > > > capabilities rin, indn, vpa, hpa, and cbt as defined in the xterm > > > terminfo entry. With these sequences implemented, it becomes > > > slightly more practical to set TERM=xterm when connecting to remote > > > systems that don't have a comprehensive terminfo database. > > > > Why is is desirable to set specifically TERM=xterm instead of, say, > > vt220, or whichever vt entry describes wscons the closest? > > The xterm entry supports colour, which vt220 does not. As someone who routinely runs xterm with TERM=vt220 I'm probably not qualified to comment further. > The multi-line scroll commands, as far as I understand, are supposed to > scroll the entire screen, (or the scrolling region). It's the "or the scrolling region" part that I'm not sure about. The terminfo documentation seems to indicate that the scrolling capabilities like "ind" are to operate on the whole screen. E.g. X/Open Curses, Issue 7 (p.353): To scroll text up, a program goes to the bottom left CORNER OF THE SCREEN and sends the ind (index) string. To scroll text down, a program goes to the top left CORNER OF THE SCREEN and sends the ri (reverse index) string. The strings ind and ri are undefined when not on their respective corners of the screen. On the other hand a few pages later the same document says (p.356): To determine whether a terminal has destructive scrolling regions or non-destructive scrolling regions, create a scrolling region in the middle of the screen, place data on the bottom line of the scrolling region, move the cursor to the top LINE OF THE SCROLLING REGION, and do a reverse index (ri) followed by a delete line (dl1) or index (ind). 
If the data that was originally on the bottom line of the scrolling region was restored into the scrolling region by dl1 or ind, then the terminal has non-destructive scrolling regions. Otherwise, it has destructive scrolling regions. I cannot find any passages that would explicitly say how ind/ri and csr interact. (Note, I'm not talking about the observed behaviour of specific xterm/vt commands, but about the semantics of terminfo capabilities as abstractly defined in the ETI). Maybe it's so obvious to everyone involved that "ind" and "ri" are to operate on the scrolling region that no-one even realizes that the current wording does actually say something different and you need to do exegetics on a tangential remark elsewhere in the document to be kinda able to infer that it's "screen (or the scrolling region)". -uwe
Re: Add five new escape sequences to wscons
On Mon, Jan 16, 2023 at 09:18:53 -0300, Crystal Kolipe wrote: > It's useful, because these sequences correspond to the terminfo > capabilities rin, indn, vpa, hpa, and cbt as defined in the xterm > terminfo entry. With these sequences implemented, it becomes > slightly more practical to set TERM=xterm when connecting to remote > systems that don't have a comprehensive terminfo database. Why is it desirable to set specifically TERM=xterm instead of, say, vt220, or whichever vt entry describes wscons the closest? For multi-line scroll the patch just calls scrollup/scrolldown, but that's not what the single-line scroll commands do (see wsemul_vt100.c). I'm actually not entirely convinced that it's even correct to describe vt220 as having sf/ind scrolling capabilities, b/c the vt220 scrolling sequences take the scrolling region into account and the terminfo capabilities for scrolling are defined to operate on the whole screen as far as I can tell. So in its current form I don't think this patch is suitable and I'm not convinced it's needed at all. -uwe
Re: KSYMS_CLOSEST
On Sun, Dec 25, 2022 at 17:41:10 +0100, Anders Magnusson wrote: > Den 2022-12-25 kl. 17:25, skrev Valery Ushakov: > > On Sun, Dec 25, 2022 at 15:42:47 +0100, Anders Magnusson wrote: > > > > > Den 2022-12-25 kl. 13:43, skrev Valery Ushakov: > > > > On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote: > > > > > > > > > IIRC it was to match the ddb "sift" command. > > > > I'm not sure I get how it might be used for sifting - a kind of "next" > > > > for external iteration? Since we never got around to do that do we > > > > still want to keep it, or shall we deprecate/delete it? > > > > > > Ah! I had to look at the code - no, it has nothing to do with sift. > > > I think it is implicit when asking for a name these days; it is used > > > to get nearest lower address address in debug output. (like > > > tstile+0x18 ) > > > > Right, right, but I wonder what could it possibly mean then, when the > > flag is not specified - as opposed to the example above. I.e. if > > KSYMS_CLOSEST is foo+0x10, what KSYMS_EXTERN (i.e. no specific flags) > > could be, other than foo+0x10, for the same address? I mean, > > technically, netbsd + 0xcaffe42 would also be a correct reply in that > > case :) > > :-) If you are not specifying KSYMS_EXACT, you may not get the exact > address, yes. That is true :-) > > > Also, checking the very first versions of ksyms code I don't see > > KSYMS_CLOSEST ever actually handled (it's defined and specified in the > > ddb strategy defines, but never tested in ksyms). May be I missed > > some later short-lived incarnation. > > > > The existing call sites that supply the flag look like cargo-cult^W^W > > common sense ("looks like you might need to specify that flag to get > > foo+0x10, well, *shrug*, won't hurt"). > I assume that might be the case, yes. > The ksyms code comes from another system for which I wrote it a long time > ago, where the meaning may have had a significance (do not remember). > But feel free to clean this up. 
> (IMHO KSYMS_EXACT should be the default, > requiring KSYMS_CLOSEST to be defined if that is requested). But KSYMS_EXACT has a different meaning. It means to look for exactly "foo" (foo+0) and fail otherwise: if ((f & KSYMS_EXACT) && (v != es->st_value)) return ENOENT; -uwe
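To make the distinction concrete, here is a toy userland lookup, not the ksyms code itself. The table, the toy_* names and the TOY_EXACT flag are made up for illustration; only the final check is modelled on the snippet quoted above. Without the flag the nearest lower symbol plus an offset is returned, and with the flag anything other than foo+0 fails.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define TOY_EXACT	0x1	/* stand-in for KSYMS_EXACT */

struct toy_sym {
	const char *name;
	unsigned long value;	/* start address of the symbol */
};

/* Made-up symbols and addresses. */
static const struct toy_sym toy_symtab[] = {
	{ "foo", 0x1000 },
	{ "bar", 0x1040 },
	{ "baz", 0x1100 },
};

/*
 * Find the symbol with the highest start address that is <= v.
 * Returns 0 and fills in *namep and *offp, or ENOENT on failure.
 */
int
toy_getname(unsigned long v, int f, const char **namep, unsigned long *offp)
{
	const struct toy_sym *es = NULL;
	size_t i;

	assert(namep != NULL && offp != NULL);

	for (i = 0; i < sizeof(toy_symtab) / sizeof(toy_symtab[0]); i++) {
		if (toy_symtab[i].value <= v &&
		    (es == NULL || toy_symtab[i].value > es->value))
			es = &toy_symtab[i];
	}
	if (es == NULL)
		return ENOENT;		/* v is below every symbol */

	/* Same shape as the quoted check: EXACT means offset must be 0. */
	if ((f & TOY_EXACT) && (v != es->value))
		return ENOENT;

	*namep = es->name;
	*offp = v - es->value;
	return 0;
}
```

Looking up 0x1010 with no flags gives foo+0x10, which is already the "nearest lower match"; in this model there is nothing left for a separate CLOSEST flag to select.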
Re: KSYMS_CLOSEST
On Sun, Dec 25, 2022 at 15:42:47 +0100, Anders Magnusson wrote: > Den 2022-12-25 kl. 13:43, skrev Valery Ushakov: > > On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote: > > > > > IIRC it was to match the ddb "sift" command. > > I'm not sure I get how it might be used for sifting - a kind of "next" > > for external iteration? Since we never got around to do that do we > > still want to keep it, or shall we deprecate/delete it? > > Ah! I had to look at the code - no, it has nothing to do with sift. > I think it is implicit when asking for a name these days; it is used > to get nearest lower address address in debug output. (like > tstile+0x18 ) Right, right, but I wonder what could it possibly mean then, when the flag is not specified - as opposed to the example above. I.e. if KSYMS_CLOSEST is foo+0x10, what KSYMS_EXTERN (i.e. no specific flags) could be, other than foo+0x10, for the same address? I mean, technically, netbsd + 0xcaffe42 would also be a correct reply in that case :) Also, checking the very first versions of ksyms code I don't see KSYMS_CLOSEST ever actually handled (it's defined and specified in the ddb strategy defines, but never tested in ksyms). May be I missed some later short-lived incarnation. The existing call sites that supply the flag look like cargo-cult^W^W common sense ("looks like you might need to specify that flag to get foo+0x10, well, *shrug*, won't hurt"). -uwe
Re: KSYMS_CLOSEST
On Sun, Dec 25, 2022 at 09:20:49 +0100, Anders Magnusson wrote: > IIRC it was to match the ddb "sift" command. I'm not sure I get how it might be used for sifting - a kind of "next" for external iteration? Since we never got around to do that do we still want to keep it, or shall we deprecate/delete it? > Den 2022-12-25 kl. 01:01, skrev Valery Ushakov: > > KSYMS_CLOSEST flag is documented as "Nearest lower match". However as > > far as I can tell nothing in ksyms code ever pays attention to this > > flag and it's not clear to me what meaning one can ascribe to the set > > of flags that doesn't have KSYMS_CLOSEST set. > > > > Ragge, do you remember what did you have in mind for it when you > > introduced it back in 2003? > > > > I think we should g/c it. > > > > -uwe -uwe
KSYMS_CLOSEST
The KSYMS_CLOSEST flag is documented as "Nearest lower match". However as far as I can tell nothing in ksyms code ever pays attention to this flag and it's not clear to me what meaning one can ascribe to the set of flags that doesn't have KSYMS_CLOSEST set. Ragge, do you remember what you had in mind for it when you introduced it back in 2003? I think we should g/c it. -uwe
Re: symbol lookup in ddb - bad heuristic
On Sun, Dec 18, 2022 at 09:38:14 -0800, Chuck Silvers wrote: > > May be the hack need to be applied only with a new special flag, say, > > KSYMS_RET? Then we can define separate DB_STGY_PROC (no heuristic) > > and DB_STGY_RET (with the heuristic). > > > > The downside is that all MD db_stack_trace_print functions need to be > > adjusted, but it actually makes sense to use both strategies there, > > b/c when we are traversing an interrupt/exception frame, the > > DB_STACK_PROC (without the heuristic) is the right thing to use, but > > unwinding a call needs DB_STACK_RET (with the new flag). > > you're right, to print the right function name we do need to distinguish > between addresses that are function return addresses and those that are not, > and the DB_STGY_RET / KSYMS_RET flags that you suggest sound like a fine way > of doing that. would you like to implement this or do you want me to do it? I probably won't get around to it until the next weekend at the earliest and I'm not territorial about this :), so if you have time and inclination, please do, otherwise I'll do it when I get around to it. Thanks. -uwe
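A toy illustration of why the two strategies want different lookups. It ASSUMES the heuristic amounts to looking up the address just before a return address (so that a call sitting at the very end of a function is still attributed to that function); the thread does not spell the heuristic out, so treat that as a guess. The table, the addresses and the toy/ret names are made up; only the two function names are borrowed from the backtraces in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-ins for the proposed DB_STGY_PROC / DB_STGY_RET. */
enum toy_stgy { TOY_STGY_PROC, TOY_STGY_RET };

struct ret_sym {
	const char *name;
	unsigned long value;	/* start address of the function */
};

/* Two adjacent functions with made-up addresses. */
static const struct ret_sym ret_symtab[] = {
	{ "sysbeepdetach", 0x2000 },
	{ "clockintr",     0x2080 },
};

/* Return the name of the function addr belongs to, or NULL. */
const char *
toy_symname(unsigned long addr, enum toy_stgy stgy)
{
	const struct ret_sym *best = NULL;
	size_t i;

	if (stgy == TOY_STGY_RET) {
		/* Assumed heuristic: a return address follows the call. */
		assert(addr > 0);
		addr--;
	}
	for (i = 0; i < sizeof(ret_symtab) / sizeof(ret_symtab[0]); i++) {
		if (ret_symtab[i].value <= addr &&
		    (best == NULL || ret_symtab[i].value > best->value))
			best = &ret_symtab[i];
	}
	return best != NULL ? best->name : NULL;
}
```

With this table an address in the middle of clockintr resolves to clockintr either way, but the entry address 0x2080 resolves to clockintr under TOY_STGY_PROC and to sysbeepdetach under TOY_STGY_RET. That is the kind of misattribution a breakpoint at function entry or an exception frame would see if the return-address lookup were applied unconditionally.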
Re: i386: 9.99.108 traps booting on VirtualBox
On Mon, Dec 12, 2022 at 23:31:06 +0300, Valery Ushakov wrote: > > > With KDTRACE_HOOKS enabled (modulo clockintr hack) and the serial > > > console (for debugging) I see the system stuck on console output when > > > rc runs. It gets unstuck on a com interrupt (e.g. pressing a key). > > > > > > Seems to work fine with KDTRACE_HOOKS disabled. > > > > Do you mean that: > > > > - with KDTRACE_HOOKS enabled, clockintr hack applied, and console on > > serial, system gets stuck on console output until com interrupt > > Yes, I get some of the early output from rc and then the system > stalls. There's no further rc output and I don't get a login prompt > on the wscons. When I type a key into the serial console, the output > gets unstuck and I get the rest of the rc output and the login prompt > on wscons. PS: This is not on real hardware though, but under VirtualBox with serial connected to a TCP port. I have an ancient Dell laptop with a real on-board serial that I can probably try to verify this with if need be. -uwe
Re: i386: 9.99.108 traps booting on VirtualBox
On Mon, Dec 12, 2022 at 20:12:57 +, Taylor R Campbell wrote: > Annoying... We really shouldn't abuse function prototypes like this: > according to the prototype, what I did with intr_kdtrace_wrapper is > correct. Right, we deceived the compiler and the compiler was like, ok, boomer... > I think it would be reasonable to add an exception like you did for > now, maybe with an INTR_NOTRACE flag (perhaps someone can find a way > to phrase this positively) instead of a magic number, until we can > remove the abuse of calling convention for clockintr. As I said, it was just a quick kludge to avoid recompiling a bunch of files (and I didn't even get the number right...). > > With KDTRACE_HOOKS enabled (modulo clockintr hack) and the serial > > console (for debugging) I see the system stuck on console output when > > rc runs. It gets unstuck on a com interrupt (e.g. pressing a key). > > > > Seems to work fine with KDTRACE_HOOKS disabled. > > Do you mean that: > > - with KDTRACE_HOOKS enabled, clockintr hack applied, and console on > serial, system gets stuck on console output until com interrupt Yes, I get some of the early output from rc and then the system stalls. There's no further rc output and I don't get a login prompt on the wscons. When I type a key into the serial console, the output gets unstuck and I get the rest of the rc output and the login prompt on wscons. > - with KDTRACE_HOOKS disabled, and console on serial, system proceeds > without getting stuck on console output? Yes. -uwe
Re: symbol lookup in ddb - bad heuristic
On Sat, Dec 10, 2022 at 01:03:06 +0300, Valery Ushakov wrote: > That causes breakpoints on a function entry to be misreported: Actually it's more than that. The corresponding MD change in i386 db_frame_info that applies the same heuristic causes another side effect. With the heuristic I get the following backtrace from the breakpoint at clockintr for the real problem I've been debugging (see my earlier mail): db{0}> bt sysbeepdetach(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at netbsd:clockintr --- switch to interrupt stack --- but with the MD part of the heuristic also disabled (I missed it originally), I get: db{0}> bt clockintr(0,0,0,0,0,0,0,0,c2d72000,c010322a) at netbsd:clockintr intr_kdtrace_wrapper(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at netbsd:intr_kdtrace_wrapper+0x21 --- switch to interrupt stack --- Yes, I should have realized I did see that intr_kdtrace_wrapper in another backtrace, taken earlier, further down the call chain: db{0}> bt hardclock(0,0,da3eef6c,c04ac8f1,0,0,0,0,0,0) at netbsd:hardclock+0x23 clockintr(0,0,0,0,0,0,0,0,c2d72000,c010322a) at netbsd:clockintr+0x2a intr_kdtrace_wrapper(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at netbsd:intr_kdtrace_wrapper+0x21 --- switch to interrupt stack --- but it kinda drifted out of focus... -uwe
Re: symbol lookup in ddb - bad heuristic
On Sat, Dec 10, 2022 at 01:03:06 +0300, Valery Ushakov wrote: > KSYMS_RET? Then we can define separate DB_STGY_PROC (no heuristic) > and DB_STGY_RET (with the heuristic). > > The downside is that all MD db_stack_trace_print functions need to be > adjusted, but it actually makes sense to use both strategies there, > b/c when we are traversing an interrupt/exception frame, the > DB_STACK_PROC (without the heuristic) is the right thing to use, but > unwinding a call needs DB_STACK_RET (with the new flag). PS: Grr. Obviously, I meant to say DB_STGY_PROC and DB_STGY_RET here. -uwe
Re: i386: 9.99.108 traps booting on VirtualBox
[ATTN: riastradh] On Fri, Dec 09, 2022 at 02:59:12 +0300, Valery Ushakov wrote: > [reposting from current-users] > > On Wed, Nov 30, 2022 at 13:05:52 +0300, Valery Ushakov wrote: > > > I tried to upgrade a 32-bit VBox VM from 9.99.99 to .107 and the > > kernel from the yesterday's sources crashes on boot. > > Tried .108 and it crashes the same with: > [ 1.0091954] trap type 6 code 0 eip 0xc0d3d8f8 cs 0x8 eflags 0x10246 cr2 > 0x3c ilevel 0x7 esp 0x6 > [ 1.0091954] curlwp 0xc1657840 pid 0 lid 0 lowest kstack 0xc192e2c0 > kernel: supervisor trap page fault, code=0 > Stopped in pid 0.0 (system) at netbsd:hardclock+0x23: movl3c(%esi),%eax > db{0}> bt > hardclock(0,0,da3eef6c,c04ac8f1,0,0,0,0,0,0) at netbsd:hardclock+0x23 > clockintr(0,0,0,0,0,0,0,0,c2d72000,c010322a) at netbsd:clockintr+0x2a > intr_kdtrace_wrapper(c2f50680,c1930d9c,0,0,0,0,0,0,0,0) at > netbsd:intr_kdtrace_wrapper+0x21 > --- switch to interrupt stack --- So the culprit is KDTRACE_HOOKS in sys/arch/x86/x86/intr.c revision 1.163 date: 2022-10-29 16:59:04 +0300; author: riastradh; state: Exp; lines: +38 -2; commitid: w28zVvYhMCIOsCZD; x86: Add dtrace probes for interrupt handler entry and return. The problem is that clockintr has magic calling convention that intr_kdtrace_wrapper doesn't know about. As a quick hack I changed i8254_initclocks to pass a magic argument (that is ignored by clockintr anyway) and told the hook code to ignore such handlers: #ifdef KDTRACE_HOOKS if (arg != (void *)0x8042c10c) { /* clockintr is magic */ ih->ih_fun = intr_kdtrace_wrapper; ih->ih_arg = ih; } #endif and that kernel doesn't crash. It's *almost* fine, but I see the problem with com(4) that I suspect is related to the recent commits by Nakahara-san: revision 1.382 date: 2022-12-09 03:35:58 +0300; author: knakahara; state: Exp; lines: +7 -29; commitid: 9zcguFpBLJvxHO4E; Revert com.c:r1.381 because i386/qemu cannot boot. Pointed out by gson@n.o and martin@n.o. 
revision 1.381 date: 2022-12-08 12:08:49 +0300; author: knakahara; state: Exp; lines: +29 -7; commitid: 0xs100bYdUbwzJ4E; Fix hang up writing /dev/console rarely in specific environments. Some BMC seems to require these syncronous operations. If not, it does not send transmit completion interrupts for some reason. With KDTRACE_HOOKS enabled (modulo clockintr hack) and the serial console (for debugging) I see the system stuck on console output when rc runs. It gets unstuck on a com interrupt (e.g. pressing a key). Seems to work fine with KDTRACE_HOOKS disabled. -uwe
symbol lookup in ddb - bad heuristic
db_printsym has the following heuristic: revision 1.68 date: 2021-12-13 04:25:29 +0300; author: chs; state: Exp; lines: +16 -2; commitid: MT9cIBmUIZU1AqkD; ddb: fix function names of "noreturn" functions in stack traces. when looking up function names for stack traces (where the addresses are the return addresses of function calls), if the address is the first instruction in the function, assume that the function being called is marked "noreturn" and that the function containing the call is actually the function immediately before the address that we looked up. to find the correct function name, do the lookup again with (address - 1) and then add one to the offset within the function that we find. That causes breakpoints on a function entry to be misreported: Breakpoint in pid 0.0 (system) at netbsd:sysbeepdetach+0x21: pushl %ebp ... db{0}> show break Map Count Address *0x0 1 netbsd:sysbeepdetach+0x21 db{0}> x/i sysbeepdetach+0x21 netbsd:clockintr: pushl %ebp Maybe the hack needs to be applied only with a new special flag, say, KSYMS_RET? Then we can define separate DB_STGY_PROC (no heuristic) and DB_STGY_RET (with the heuristic). The downside is that all MD db_stack_trace_print functions need to be adjusted, but it actually makes sense to use both strategies there, b/c when we are traversing an interrupt/exception frame, the DB_STACK_PROC (without the heuristic) is the right thing to use, but unwinding a call needs DB_STACK_RET (with the new flag). Thoughts? -uwe
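To make the two proposed strategies concrete, here is a small sketch over a toy symbol table. It is not the ddb or ksyms code; struct toy_sym, the TOY_STGY_* constants and toy_lookup() are hypothetical names invented for the example. The "return address" strategy applies the (address - 1) heuristic from the commit message, the "proc" strategy does not.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy symbol: a name and the address of its first instruction. */
struct toy_sym {
	const char	*name;
	uintptr_t	start;
};

enum toy_strategy {
	TOY_STGY_PROC,	/* addr is a pc or breakpoint: look it up as is */
	TOY_STGY_RET	/* addr is a return address: apply the heuristic */
};

/* Last symbol starting at or below addr; table is sorted by start. */
static const struct toy_sym *
toy_nearest(const struct toy_sym *tab, size_t n, uintptr_t addr)
{
	const struct toy_sym *best = NULL;

	for (size_t i = 0; i < n; i++)
		if (tab[i].start <= addr)
			best = &tab[i];
	return best;
}

const struct toy_sym *
toy_lookup(const struct toy_sym *tab, size_t n, uintptr_t addr,
    enum toy_strategy stgy, uintptr_t *offset)
{
	assert(tab != NULL && offset != NULL);

	const struct toy_sym *sym = toy_nearest(tab, n, addr);
	if (sym == NULL)
		return NULL;
	if (stgy == TOY_STGY_RET && addr == sym->start && addr > 0) {
		/*
		 * A return address at a function's first instruction:
		 * assume the call was the last instruction of the
		 * previous ("noreturn"-calling) function.
		 */
		const struct toy_sym *prev = toy_nearest(tab, n, addr - 1);
		if (prev != NULL)
			sym = prev;
	}
	*offset = addr - sym->start;
	return sym;
}
```

With sysbeepdetach at 0x1000 and clockintr at 0x1021, looking up 0x1021 gives clockintr+0 under TOY_STGY_PROC and sysbeepdetach+0x21 under TOY_STGY_RET, which is the misreport shown above when a breakpoint address is treated as a return address.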
i386: 9.99.108 traps booting on VirtualBox
[reposting from current-users] On Wed, Nov 30, 2022 at 13:05:52 +0300, Valery Ushakov wrote: > I tried to upgrade a 32-bit VBox VM from 9.99.99 to .107 and the > kernel from the yesterday's sources crashes on boot. Tried .108 and it crashes the same with: > boot netbsd.new 21926532+587532+743668 [994880+103+13802]=0x182cf08 [ 1.000] cpu_rng: rdrand/rdseed [ 1.000] entropy: ready [ 1.000] Copyright (c) 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004, 2005, [ 1.000] 2006, 2007, 2008, 2009, 2010, 2011, 2012, 2013, 2014, 2015, 2016, 2017, [ 1.000] 2018, 2019, 2020, 2021, 2022 [ 1.000] The NetBSD Foundation, Inc. All rights reserved. [ 1.000] Copyright (c) 1982, 1986, 1989, 1991, 1993 [ 1.000] The Regents of the University of California. All rights reserved. [ 1.000] NetBSD 9.99.108 (GENERIC) #0: Fri Dec 9 01:23:00 MSK 2022 [ 1.000] uwe@majava:/home/uwe/work/netbsd/cvs/src/sys/arch/i386/compile/GENERIC [ 1.000] total memory = 1023 MB [ 1.000] avail memory = 980 MB [ 1.040] mainbus0 (root) [ 1.040] ACPI: RSDP 0x000E 24 (v02 VBOX ) [ 1.040] ACPI: XSDT 0x3FFF0030 34 (v01 VBOX VBOXXSDT 0001 ASL 0061) [ 1.040] ACPI: FACP 0x3FFF00F0 F4 (v04 VBOX VBOXFACP 0001 ASL 0061) [ 1.040] ACPI: DSDT 0x3FFF05B0 002353 (v02 VBOX VBOXBIOS 0002 INTL 20200925) [ 1.040] ACPI: FACS 0x3FFF0200 40 [ 1.040] ACPI: SSDT 0x3FFF0240 00036C (v01 VBOX VBOXCPUT 0002 INTL 20200925) [ 1.040] ACPI: 2 ACPI AML tables successfully acquired and loaded [ 1.040] cpu0 at mainbus0 [ 1.040] cpu0: Intel(R) Core(TM) i5-5300U CPU @ 2.30GHz, id 0x306d4 [ 1.040] cpu0: node 0, package 0, core 0, smt 0 [ 1.040] acpi0 at mainbus0: Intel ACPICA 20220331 [ 1.040] acpi0: fixed power button present [ 1.040] acpi0: fixed sleep button present [ 1.0091954] pckbc1 at acpi0 (PS2K, PNP0303) (kbd port): io 0x60,0x64 irq 1 [ 1.0091954] pckbc2 at acpi0 (PS2M, PNP0F03) (aux port): irq 12 [ 1.0091954] attimer1 at acpi0 (TIMR, PNP0100): io 0x40-0x43,0x50-0x53 [ 1.0091954] SRL0 (PNP0501) at acpi0 not configured [ 1.0091954] acpivga0 
at acpi0 (GFX0): ACPI Display Adapter [ 1.0091954] acpiout0 at acpivga0 (VGA, 0x0100): ACPI Display Output Device [ 1.0091954] acpibat0 at acpi0 (BAT0, PNP0C0A-0): ACPI Battery [ 1.0091954] acpiacad0 at acpi0 (AC, ACPI0003-0): ACPI AC Adapter [ 1.0091954] apm0 at acpi0: Power Management spec V1.2 [ 1.0091954] ACPI: Enabled 2 GPEs in block 00 to 07 [ 1.0091954] pckbd0 at pckbc1 (kbd slot) [ 1.0091954] pckbc1: using irq 1 for kbd slot [ 1.0091954] wskbd0 at pckbd0 mux 1 [ 1.0091954] pms0 at pckbc1 (aux slot) [ 1.0091954] pckbc1: using irq 12 for aux slot [ 1.0091954] wsmouse0 at pms0 mux 0 [ 1.0091954] pci0 at mainbus0 bus 0: configuration mode 1 [ 1.0091954] pchb0 at pci0 dev 0 function 0: Intel 82441FX (PMC) PCI and Memory Controller (rev. 0x02) [ 1.0091954] pcib0 at pci0 dev 1 function 0: Intel 82371SB (PIIX3) PCI-ISA Bridge (rev. 0x00) [ 1.0091954] piixide0 at pci0 dev 1 function 1: Intel 82371AB IDE controller (PIIX4) (rev. 0x01) [ 1.0091954] piixide0: primary channel interrupting at irq 14 [ 1.0091954] atabus0 at piixide0 channel 0 [ 1.0091954] piixide0: secondary channel interrupting at irq 15 [ 1.0091954] atabus1 at piixide0 channel 1 [ 1.0091954] vga0 at pci0 dev 2 function 0: VirtualBox Graphics (rev. 0x00) [ 1.0091954] wsdisplay0 at vga0 kbdmux 1 [ 1.0091954] drm at vga0 not configured [ 1.0091954] wm0 at pci0 dev 3 function 0: Intel i82540EM 1000BASE-T Ethernet (rev. 0x02) [ 1.0091954] wm0: interrupting at irq 9 [ 1.0091954] wm0: Ethernet address 08:00:27:d2:84:ac [ 1.0091954] makphy0 at wm0 phy 1: Marvell 88E1011 Gigabit PHY, rev. 
4 [ 1.0091954] makphy0: 10baseT, 10baseT-FDX, 100baseTX, 100baseTX-FDX, auto [ 1.0091954] VirtualBox Guest Service (miscellaneous system) at pci0 dev 4 function 0 not configured [ 1.0091954] auich0 at pci0 dev 5 function 0: i82801AA (ICH) AC-97 Audio [ 1.0091954] auich0: interrupting at irq 11 [ 1.0091954] auich0: ac97: SigmaTel STAC9700 codec; no 3D stereo [ 1.0091954] auich0: ac97: ext id 0x809 [ 1.0091954] ohci0 at pci0 dev 6 function 0: Apple Computer Intrepid USB Controller (rev. 0x00) [ 1.0091954] ohci0: interrupting at irq 10 [ 1.0091954] ohci0: OHCI version 1.0 [ 1.0091954] usb0 at ohci0: USB revision 1.0 [ 1.0091954] piixpm0 at pci0 dev 7 function 0: Intel 82371AB (PIIX4) Power Management Controller (rev. 0x08) [ 1.0091954] piixpm0: interrupting at irq 9 [ 1.0091954] iic0 at piixpm0 port 0: I2C bus [ 1.0091954] wm1 at pci0 dev 8 function 0: Intel i82540EM 1000BASE-T Ethernet (rev. 0x02) [ 1.0091954] wm1: interrupting at irq 11 [ 1.0091954] wm1: Ethernet address 08:00:27:95:0b:c1 [ 1.0091954
Re: Limiting malloc to the low 2GB?
On Mon, Nov 28, 2022 at 23:22:08 +, RVP wrote: > On Tue, 29 Nov 2022, Valery Ushakov wrote: > > > Turns out you can use MALLOC_CONF="dss:primary" to make (the new) > > jemalloc prefer sbrk(2). > > Yes, I saw that, but, I wasn't sure if that setting meant a) always > use sbrk() or b) prefer sbrk(), then, fall-back to mmap() (typically > for very large allocations). Me too :) But that works for now. Will need to RTFS. -uwe
Re: Limiting malloc to the low 2GB?
On Mon, Nov 28, 2022 at 21:45:40 +, RVP wrote: > On Mon, 28 Nov 2022, Valery Ushakov wrote: > > > Do we have a way to tell malloc on a 32-bit system to allocate memory > > only below the 2GB boundary (on i386, including when run under amd64)? > > I'm trying to port a(n old) program that wants to use the sign bit for > > its internal purposes. I guess one option would be to prevent malloc > > from using mmap (and disable ASLR?) so that only sbrk (in the low 2GB) > > is used. > > The standard jemalloc in the system has a compile-time flag to do this > `--with-lg-vaddr=31'. No run-time setting possible from what I can see. > Or, you could compile the program against the old `src/lib/libbsdmalloc' > which only uses sbrk(). Turns out you can use MALLOC_CONF="dss:primary" to make (the new) jemalloc prefer sbrk(2). The man page documents that you can also use const char *malloc_conf = "..."; in your program, but the variable you actually have to use is __je_malloc_conf. There is __weak_alias(malloc_conf, __je_malloc_conf) but that doesn't work across DSO boundaries, I guess. -uwe
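A rough sketch of how a program could combine the two pieces: set the jemalloc option at compile time and verify each allocation instead of trusting it. The __je_malloc_conf name is taken from the mail above and is specific to that jemalloc build; ptr_below_2gb() and checked_malloc() are hypothetical helpers. Since "dss:primary" may only mean "prefer sbrk" (the open question in this thread), the check fails loudly rather than letting a high pointer corrupt the sign-bit tagging.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/*
 * Per the mail: the variable the NetBSD jemalloc actually reads.
 * On other systems this is just an unused global.
 */
const char *__je_malloc_conf = "dss:primary";

/* True if the address fits in 31 bits, i.e. the 32-bit sign bit is clear. */
bool
ptr_below_2gb(const void *p)
{
	return (uintptr_t)p < (uintptr_t)UINT32_C(0x80000000);
}

/*
 * malloc() wrapper for a program that tags pointers with the sign bit:
 * abort instead of handing out memory at or above 2GB.
 */
void *
checked_malloc(size_t size)
{
	void *p = malloc(size);

	assert(p == NULL || ptr_below_2gb(p));
	return p;
}
```

On a 64-bit host checked_malloc() will normally trip the assertion, which is the point: it is only meant for the 32-bit build discussed here.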
Limiting malloc to the low 2GB?
Do we have a way to tell malloc on a 32-bit system to allocate memory only below the 2GB boundary (on i386, including when run under amd64)? I'm trying to port a(n old) program that wants to use the sign bit for its internal purposes. I guess one option would be to prevent malloc from using mmap (and disable ASLR?) so that only sbrk (in the low 2GB) is used. Suggestions are appreciated. -uwe
Re: Module autounload proposal: opt-in, not opt-out
On Sun, Aug 07, 2022 at 23:08:47 +, Taylor R Campbell wrote: > Currently there are many types of modules that are autoloaded from > open-ended patterns: Our current auto-load policy is a bit too enthusiastic, IMHO. E.g. an "unknown" ioctl used to (probably still does) trigger autoload of compat module, so each time you run vi you have compat module loaded. On one hand it might seem convenient, but on the other hand we are getting into the "WAT?!" territory, like those dynamically typed languages that go out of their way to interpret your program in *some* way, using most vexing coercions, as if failure is not an option. IMO, it most certainly is an option and often the most sensible one too. Auto-unload seems to me like a kludge to compensate for the blind "let's try and see if this helps" auto-load policy. -uwe
Re: userconf question
On Fri, Aug 05, 2022 at 07:35:17 +, Emmanuel Dreyfus wrote: > menu=Boot normally:rndseed /var/db/entropy-file;boot > menu=Drop to boot prompt:prompt > default=1 > timeout=3 > userconf=disable atabus0 > clear=1 > > But atabus0 is still configured, and wd0 still breaks the boot with > timeouts. What is wrong with the above userconf syntax? I have very vague memories of userconf, but does your kernel config have explicit atabus0 or catch-all atabus*? IIRC, userconf does not operate on devices (e.g. atabus0 as instantiated from atabus*), but on the configured attachments (i.e. you can only disable atabus0 if you have that line in the config). -uwe
Re: Proposal: Deprecate (or rename) extsrc/
On Fri, Jan 07, 2022 at 12:47:53 +1100, Luke Mewburn wrote: > The "extsrc/" tree was added in late 2009. > Nothing in tree uses "extsrc/"; it's a placeholder for third-party > vendors to hook into the build for their own extensions. > There's no reason a vendor can't just integrate into the build > with local changes - that's what Wasabi Systems did in the early 2000s. > > (Also, if I recall correctly, "extsrc/" without much consultation). > > If some people require the "extsrc/" functionality in tree, then I propose: > 1) A good case to retain the functionality should be made by them. > 2) A better name than "extsrc/" should be chosen, that's not >going to cause completion rage. Maybe "3rdparty"? Yeah, it's kinda there "for the furniture" only. The build.sh usage still claims the default is /usr/extsrc and the makefile do-extsrc target uses hardcoded "extsrc" instead of EXTSRCSRCDIR (sic). Also it's built separately, so one can't easily use it to add say a library that some other 3rd party tool already in src can use, etc. -uwe
Re: wsvt25 backspace key should match terminfo definition
On Tue, Nov 23, 2021 at 18:37:19 -0500, Greg Troxel wrote: > Valery Ushakov writes: > > > vt52 is different. I never used a real vt52 or a clone, but the > > manual at vt100.net gives the following picture: > > > > https://vt100.net/docs/vt52-mm/figure3-1.html > > > > and the description > > > > https://vt100.net/docs/vt52-mm/chapter3.html#S3.1.2.3 > > > > Key Code Action Taken if Codes Are Echoed > > BACK SPACE 010 Backspace (Cursor Left) function > > DELETE 177 Nothing > > That is explaining what the terminal does when those codes are sent by > the computer. That is a different thing from how the computer > interprets user input. No. Or rather not only. Please, read the sentence before that table. The "code" column is the code that the terminal transmits when the key is pressed: Table 3-4 lists the function keys, the code they transmit to the host, and the terminal action taken if the code is echoed back to the terminal. > When using a VT52 on Seventh Edition, for example one pushed DEL to > remove the previous character, and the computer would send > "" to make it disappear and leave the cursor left. One > basically never pushed BS. It dawned on me that the terminals I used on the pdp-11 clone were (not surprisingly) vt clones and managed to find a picture of the keyboard, which jogged my memory: http://www.leningrad.su/museum/show_big.php?n=1539 so yeah, you would use DEL key on those to correct your typing mistakes. > > But vt200 and later use a different keyboard, lk201 (and I did use a > > real vt220 a lot) > > > > https://vt100.net/docs/vt220-rm/figure3-1.html > > > > that picture is not very good, the one from the vt320 manual is better > > > > https://vt100.net/docs/vt320-uu/chapter3.html > > > > vt220 does NOT have a configuration option that selects the code that > > the > So that is the "DEL" key, not the BS key. See, this is exactly why I said "
Re: wsvt25 backspace key should match terminfo definition
On Tue, Nov 23, 2021 at 19:23:30 +0100, Johnny Billquist wrote: > > But somehow the official terminfo database has kbs=^H for vt220! > > Which is wrong. Exactly. > >kbs=^?, > > Which I think it should be. Amen! (unironically) :) -uwe
Re: wsvt25 backspace key should match terminfo definition
On Tue, Nov 23, 2021 at 09:22:43 -0500, Greg Troxel wrote: > Valery Ushakov writes: > > > On Tue, Nov 23, 2021 at 00:01:40 +, RVP wrote: > > > >> On Tue, 23 Nov 2021, Johnny Billquist wrote: > >> > >> > If something pretends to be a VT220, then the key that deletes > >> > characters to the left should send DEL, not BS... > >> > Just saying... > >> > >> That's fine with me too. As long as things are consistent. I suggested the > >> kernel change because both terminfo definitions (and the FreeBSD console) > >> go for ^H. > > > > Note that the pckbd_keydesc_us keymap maps the scancode of the <- key to > > > > KC(14), KS_Cmd_ResetEmul, KS_Delete, > > > > i.e. 0x7f (^?). > > > > terminfo is obviously incorrect here. Amazingly, the bug is actually > > in vt220 description! wsvt25 just inherits from it: > > > > $ infocmp -1 vt220 | grep kbs > > kbs=^H, > > > > I checked termcap.src from netbsd-4 and it's wrong there too. I have > > no idea htf that could have happened. > > I think (memory is getting fuzzy) the problem is that the old terminals > had a delete key, in the upper right, that users use to remove the > previous character, and a BS key, upper left, that was actually a > carriage control character. [... snip ...] > I see the same kbs=^H on vt52. vt52 is different. 
I never used a real vt52 or a clone, but the manual at vt100.net gives the following picture: https://vt100.net/docs/vt52-mm/figure3-1.html and the description https://vt100.net/docs/vt52-mm/chapter3.html#S3.1.2.3 Key Code Action Taken if Codes Are Echoed BACK SPACE 010 Backspace (Cursor Left) function DELETE 177 Nothing vt100 had a similar keyboard (again, never used a real one personally) https://vt100.net/docs/vt100-ug/chapter3.html#F3-2 BACKSPACE 010 Backspace function DELETE 177 Ignored by the VT100 But vt200 and later use a different keyboard, lk201 (and I did use a real vt220 a lot) https://vt100.net/docs/vt220-rm/figure3-1.html that picture is not very good, the one from the vt320 manual is better https://vt100.net/docs/vt320-uu/chapter3.html vt220 does NOT have a configuration option that selects the code that the https://vt100.net/docs/vt320-uu/chapter4.html#S4.13 For vt320 (where it *is* configurable) terminfo has $ infocmp -1 vt320 | grep kbs kbs=^?, > I think the first thing to answer is "what is kbs in terminfo supposed > to mean". X/Open Curses, Issue 7 doesn't explain, other than saying "backspace" key, which is an unfortunate name, as it's loaded. But it's sufficiently clear from the context that it's the key that deletes backwards, i.e. My other question is how kbs is used from terminfo. Is it about > generating output sequences to move the active cursor one left? If so, > it's right. Is it about "what should the user type to delete left", > then for a vt52/vt220, that's wrong. If it is supposed to be both, > that's an architectural bug as those aren't the same thing. No, k* capabilities are sequences generated by the terminal when some key is pressed. The capability for the sequence sent to the terminal to move the cursor left one position is cub1 $ infocmp -1 vt220 | grep cub1 cub1=^H, kcub1=\E[D, (kcub1 is the sequence generated by the left arrow _k_ey). -uwe
Re: wsvt25 backspace key should match terminfo definition
On Tue, Nov 23, 2021 at 00:01:40 +, RVP wrote: > On Tue, 23 Nov 2021, Johnny Billquist wrote: > > > If something pretends to be a VT220, then the key that deletes > > characters to the left should send DEL, not BS... > > Just saying... > > That's fine with me too. As long as things are consistent. I suggested the > kernel change because both terminfo definitions (and the FreeBSD console) > go for ^H. Note that the pckbd_keydesc_us keymap maps the scancode of the <- key to KC(14), KS_Cmd_ResetEmul, KS_Delete, i.e. 0x7f (^?). terminfo is obviously incorrect here. Amazingly, the bug is actually in vt220 description! wsvt25 just inherits from it: $ infocmp -1 vt220 | grep kbs kbs=^H, I checked termcap.src from netbsd-4 and it's wrong there too. I have no idea htf that could have happened. -uwe
Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl
On Wed, Oct 27, 2021 at 20:59:12 -0700, Jason Thorpe wrote: > > On Oct 27, 2021, at 4:01 PM, Jason Thorpe wrote: > > > > > >> On Oct 27, 2021, at 3:44 PM, Valery Ushakov wrote: > >> > >> On Wed, Oct 27, 2021 at 07:50:55 -0700, Jason Thorpe wrote: > >> > >> I was wondering if it might be easier to not put the onus onto the > >> caller and instead have a function that returns the interrupted > >> ucontext (or NULL, if the pc is not in a trampoline). > >> > >> ucontext_t *__unwind_sigtramp(return_pc, return_sp) > > > > That would certainly be a nicer API. > > Thought about it a little more. > > To make this really work, we'd definitely have to version > sigaction() so that it fully de-supported sigcontext handlers. > Otherwise, it's a toss-up whether you have a sigcontext or a > ucontext on the stack. It is ucontext for the siginfo trampoline and sigcontext for the older one, isn't it? > I also see some value in the basic check (which you need to have to > __sigtramp_unwind() anyway?). You need it internally, obviously, but the only useful thing you can do with it is get the interrupted context that the trampoline would restore. Or do I miss something here (which is entirely possible, as I'm writing all these remarks from fading memories of implementing the trampolines for sh3). -uwe
Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl
On Wed, Oct 27, 2021 at 07:50:55 -0700, Jason Thorpe wrote: > > On Oct 18, 2021, at 9:41 AM, John Marino (NetBSD) wrote: > > > > yes, it sounds like a __in_signal_trampoline function would work for > > the GCC unwind, and I would think it would work for GDB as well. > > Ok, I have implemented a new function with this signature: > > /* > * __sigtramp_check_np -- > * > * Non-portable function that checks if the specified program > * counter value is within the signal return trampoline. Returns > * the trampoline version number corresponding to what style of > * trampoline it matches, or -1 if the program value is not within > * the signal return trampoline. > */ > int __sigtramp_check_np(void *pc); > > Usage would be like: [... lots of code ...] I was wondering if it might be easier to not put the onus onto the caller and instead have a function that returns the interrupted ucontext (or NULL, if the pc is not in a trampoline). ucontext_t *__unwind_sigtramp(return_pc, return_sp) -uwe
Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl
On Mon, Oct 18, 2021 at 10:41:48 -0500, John Marino (NetBSD) wrote: > How we did it with libc before is shown in the netbsd-unwind.h link in > the original post. This technique looks for __sigtramp_siginfo_2 > assembly code but no longer works. I don't know how to do this any > other way. GDB doesn't either, it uses the debug information to match > the function name __sigtramp_siginfo_2 and I am not even sure that's > valid for current NetBSD releases based on what we've learned here. Didn't kamil@ fix this a while back? E.g. for amd64: revision 1.8 date: 2020-10-12 20:55:54 +0300; author: kamil; state: Exp; lines: +29 -3; commitid: sz57gQtWi3mGKDrC; Decorate the x86_64 signal trampoline with CFI attributes easing unwinding Combine the approach provided by Nikhil Benesch and Andrew Cagney. Now, the unwinders (in gccgo, backtrace(3), etc) can unwind properly the stack from a signal handler. Fixes lib/55719 by Nikhil Benesch -uwe
Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl
On Fri, Oct 15, 2021 at 23:14:39 +0300, Valery Ushakov wrote: > On Fri, Oct 15, 2021 at 14:44:16 -0500, John Marino (NetBSD) wrote: > > > Is it possible for NetBSD to implement KERN_PROC_SIGTRAMP sysctl? > > It's been ages since I touched this area, but don't we have > per-sigaction trampolines? I mean, in practice they all use the same > __sigtramp_siginfo_$version trampoline, that sigaction passes to the > actual syscall, but in principle the process can have different > trampolines for different signals, can't it? > > struct sys___sigaction_sigtramp_args { > syscallarg(int) signum; > syscallarg(const struct sigaction *) nsa; > syscallarg(struct sigaction *) osa; > syscallarg(const void *) tramp; // <- > syscallarg(int) vers; > }; PS: We used to have a trampoline that the kernel copied out into the process address space (bottom of the stack, iirc) - and that would be something for KERN_PROC_SIGTRAMP to return indeed. But that was like before netbsd 2.0, iirc. -uwe
Re: Request for implementation of KERN_PROC_SIGTRAMP sysctl
On Fri, Oct 15, 2021 at 14:44:16 -0500, John Marino (NetBSD) wrote: > Is it possible for NetBSD to implement KERN_PROC_SIGTRAMP sysctl? > > TLDR; > For several years, the GNAT Ada compiler has not been able to unwind a > stack containing a signal trampoline. The unwinder I wrote for gcc > several years ago just stopped working on newer NetBSD release even > though the signal trampoline code itself did not change. FreeBSD and > DragonFly BSD are immune to sigtramp location changes because they've > introduced the KERN_PROC_SIGTRAMP sysctl which provides the location > of the signal tramp of the process. It's been ages since I touched this area, but don't we have per-sigaction trampolines? I mean, in practice they all use the same __sigtramp_siginfo_$version trampoline, that sigaction passes to the actual syscall, but in principle the process can have different trampolines for different signals, can't it? struct sys___sigaction_sigtramp_args { syscallarg(int) signum; syscallarg(const struct sigaction *) nsa; syscallarg(struct sigaction *) osa; syscallarg(const void *) tramp; // <- syscallarg(int) vers; }; -uwe
Re: Level for Unix-domain socket options
On Thu, Aug 05, 2021 at 22:55:12 +0200, Rhialto wrote: > On Thu 05 Aug 2021 at 13:22:55 +, nia wrote: > > The unix(4) man page incorrectly states: > > > > "A UNIX-domain socket supports two socket-level options for use with > > setsockopt(2) and getsockopt(2): [...]" > > > > In reality, the protocol level when using these socket options > > must be 0, which is a magic number not really documented anywhere > > except the test suite. > > and getsockopt(2) says > > DESCRIPTION > getsockopt(), setsockopt() and getsockopt2() manipulate the options > associated with a socket. Options may exist at multiple protocol levels; > they are always present at the uppermost "socket" level. > > which I interpret to mean that even if you use SOL_SOCKET for these > options, it should work. Do I read that as intended? Was that perhaps an artifact of an old implementation? POSIX says The getsockopt() function shall fail if: [EINVAL] The specified option is invalid at the specified socket level. [ENOPROTOOPT] The option is not supported by the protocol. while our man page only has [ENOPROTOOPT] The option is unknown at the level indicated. which might actually be problematic, but I will leave the exegetic exercise to someone more skilled. -uwe
Re: protect pmf from network drivers that don't provide if_stop
On Thu, Jul 01, 2021 at 06:47:08 -0300, Jared McNeill wrote: > Not really a fan of this as it doesn't protect other potential if_stop users > (and "temporary fix" rarely is..). How about something like this instead? I agree. If for whatever reason we really insist on if-specific stop, then just use a different stub that would complain about "pure virtual" method called, kassert, or whatever. > --- sys/net/if.c 29 Jun 2021 21:19:58 - 1.486 > +++ sys/net/if.c 1 Jul 2021 09:46:10 - > @@ -761,11 +761,13 @@ void > if_register(ifnet_t *ifp) > { > /* > - * If the driver has not supplied its own if_ioctl, then > - * supply the default. > + * If the driver has not supplied its own if_ioctl or if_stop, > + * then supply the default. > */ > if (ifp->if_ioctl == NULL) > ifp->if_ioctl = ifioctl_common; > + if (ifp->if_stop == NULL) > + ifp->if_stop = if_nullstop; > > sysctl_sndq_setup(&ifp->if_sysctl_log, ifp->if_xname, &ifp->if_snd); -uwe
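The pattern in the quoted diff, reduced to a toy that can be compiled on its own. struct toy_ifnet and all toy_* functions are made-up stand-ins, not the sys/net/if.c code; the point is only that filling in missing method pointers once at registration lets every later caller invoke if_stop without a NULL check.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for struct ifnet with just the fields the sketch needs. */
struct toy_ifnet {
	int	(*if_ioctl)(struct toy_ifnet *, unsigned long, void *);
	void	(*if_stop)(struct toy_ifnet *, int);
	int	stopped;	/* set by a driver-supplied stop routine */
};

/* Default ioctl: accept everything (placeholder for a common handler). */
static int
toy_ioctl_common(struct toy_ifnet *ifp, unsigned long cmd, void *data)
{
	(void)ifp; (void)cmd; (void)data;
	return 0;
}

/* Default stop: do nothing, but be safe to call. */
static void
toy_nullstop(struct toy_ifnet *ifp, int disable)
{
	(void)ifp; (void)disable;
}

/* Registration fills in whatever the driver left unset. */
void
toy_if_register(struct toy_ifnet *ifp)
{
	assert(ifp != NULL);
	if (ifp->if_ioctl == NULL)
		ifp->if_ioctl = toy_ioctl_common;
	if (ifp->if_stop == NULL)
		ifp->if_stop = toy_nullstop;
}

/* A driver that does provide its own stop routine. */
void
toy_real_stop(struct toy_ifnet *ifp, int disable)
{
	ifp->stopped = disable ? 2 : 1;
}

/* Any later caller (a suspend hook, say) needs no NULL check. */
void
toy_suspend(struct toy_ifnet *ifp)
{
	ifp->if_stop(ifp, 1);
}
```

The "complaining stub" variant suggested in the reply would differ only in toy_nullstop(): it would log or assert instead of silently returning.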
Re: Inconsistencies in usage of "locators" argument to config (*ca_rescan)() functions
On Fri, Mar 26, 2021 at 13:18:16 -0700, Jason Thorpe wrote: > I think it may have been the terminology used by Chris Torek in his > paper on the new 4.4BSD device auto configuration framework [...]. > Sadly, that paper is somewhat hard to find, and I don't know if it > was ever actually published anywhere. This one? :) http://www.netbsd.org/docs/kernel/config-torek.ps -uwe
Re: checking for a closed socket
On Tue, Feb 02, 2021 at 19:20:22 +0100, Manuel Bouyer wrote: > I've been debugging an issue with Xen, where xenstored loops at 100% > CPU on poll(2). > after code analysis it's looping on closed Unix socket descriptors. > From what I understood the code expect poll(2) to return something > different from POLLIN when the remote end of the socket is > closed (it checks for (~(POLLOUT|POLLIN)) so it could be either > POLLERR or POLLHUP I guess - or eventually POLLRDHUP which we don't have). > > Who is right here, linux or NetBSD (linux claims to be posix, while > our man page doesn't mention it) ? > > Is there a way to check if a connection has been closed without a read() ? You have to be careful what you read into "claim to be posix", especially when connection creation and termination are concerned. Termination is extra fun because there are half-closed sockets. My experience is that the only thing you can rely on is that if POLLFOO is reported for an fd then the "foo" action on that fd will not block - which is, essentially, poll's principal raison d'etre. The details can vary wildly from system to system, so you might need some strategic planning and experimentation. I don't have all my relevant notes handy, but as an example, consider a failed connect(2) that you poll for POLLOUT (posix: "A file descriptor for a socket that is connecting asynchronously shall indicate that it is ready for writing, once a connection has been established."). On failed connect(2) you will get: - NetBSD, Solaris: POLLOUT - Linux: POLLERR | POLLHUP | POLLOUT - MacOS: POLLHUP POLLHUP on "close" is even more fun because of half-closed connections. NetBSD and Solaris never report POLLHUP for sockets, MacOS reports POLLHUP when remote closes, Linux reports POLLHUP when both directions are closed. Note that getting POLLHUP doesn't mean that you can immediately "give up" on that socket, you still have to read it b/c there may still be unread data. E.g.
consider sending a request, half-closing your side, getting a reply from the server that ends up in the kernel's socket buffer followed by the server half-closing its end and thus completely closing the connection. At this point you haven't read anything yet in the application, but you will get POLLHUP (and POLLIN for the data, iirc). So that POLLHUP is not really telling you much. All of the above is strictly "IIRC" and might have changed since the last time I checked. To reiterate, my point is that 1) you can assume very little about specific events reported for boundary conditions - different systems report them differently; 2) you have to remember that the main promise of the poll(2) is that the corresponding operation will not block. PS: Sorry if that was a bit on the rambling side. -uwe
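A minimal sketch of the defensive style this implies, assuming a blocking stream descriptor: treat any poll(2) wakeup as "read() will not block", and decide that the peer is gone only when read() itself returns 0. The function name drain_until_eof() and the buffer size are arbitrary choices for the example. It deliberately does not branch on POLLHUP versus POLLIN, since those differ between systems as described above.

```c
#include <assert.h>
#include <errno.h>
#include <poll.h>
#include <unistd.h>

/*
 * Read and discard data from fd until end-of-file.  Returns the number
 * of bytes consumed before EOF, or -1 on error or timeout.
 */
ssize_t
drain_until_eof(int fd, int timeout_ms)
{
	char buf[256];
	ssize_t total = 0;

	assert(fd >= 0);
	for (;;) {
		struct pollfd pfd = { .fd = fd, .events = POLLIN };
		int n = poll(&pfd, 1, timeout_ms);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		if (n == 0 || (pfd.revents & POLLNVAL))
			return -1;	/* timeout or bad descriptor */

		/*
		 * POLLIN, POLLHUP and POLLERR all promise the same thing:
		 * read() will not block.  Let read() say what happened.
		 */
		ssize_t r = read(fd, buf, sizeof(buf));
		if (r < 0) {
			if (errno == EINTR)
				continue;
			return -1;	/* e.g. a pending socket error */
		}
		if (r == 0)
			return total;	/* orderly EOF from the peer */
		total += r;
	}
}
```

A real event loop would hand the data to the application instead of discarding it, but the ordering is the same: drain first, conclude "closed" only on a zero-length read.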
enet(4) problem? (Was: NFS client performance problems)
TL;DR: looks like a problem in enet(4) On Fri, Dec 25, 2020 at 02:20:13 +0300, Valery Ushakov wrote: > I've stumbled into a weird performance problem with NFS client. I > have CompuLab's Utilite Pro (evbarm) machine running a very current > -current. It's connected to my Ubuntu laptop with Ethernet (direct > cable link) and uses it as an NFS server to get e.g. pkgsrc tree and > distfiles. The performance is really bad, e.g. make extract of a > package may take literal ages (I did a lot of the initial > investigation for this mail while python distfile was being > extracted). Extracting 31M uncompressed bash distfile may take, from > run to run: > > real 2m21.110s > user 0m0.635s > sys 0m4.233s > > or > > real 4m52.010s > user 0m0.769s > sys 0m4.815s > > or whatever. > > Looking at the traffic with wireshark I can see a curious recurring > pattern. Looking at the time/sequence plot one immediately sees short > bursts of activity separated by huge gaps of no traffic. > > Here's one such instance abridged and wrapped for clarity. Timestamps > are shown as delta to the previous frame. Window scale is 3 for the > client (utilite) and 7 for the server (majava), i.e the server has > plenty of window open. > > > 413 00.000351 IP utilite.1011 > majava.nfs: Flags [.], > seq 177121:178569, ack 79601, win 3225, > options [nop,nop,TS val 111 ecr 1941833655], > length 1448: > NFS request xid 1992059772 1444 write fh ... 
5406 (5406) bytes > @ 16384 > > 414 00.48 IP utilite.1011 > majava.nfs: Flags [.], > seq 178569:180017, ack 79601, win 3225, > options [nop,nop,TS val 111 ecr 1941833655], > length 1448 > 415 00.09 IP majava.nfs > utilite.1011: Flags [.], > ack 180017, win 1834, > options [nop,nop,TS val 1941833656 ecr 111], > length 0 > > 416 00.51 IP utilite.1011 > majava.nfs: Flags [.], > seq 180017:181465, ack 79601, win 3225, > options [nop,nop,TS val 111 ecr 1941833655], > length 1448 > 417 00.043745 IP majava.nfs > utilite.1011: Flags [.], > ack 181465, win 1834, > options [nop,nop,TS val 1941833700 ecr 111], > length 0 > > 418 00.994813 IP utilite.1011 > majava.nfs: Flags [P.], > seq 181465:182645, ack 79601, win 3225, > options [nop,nop,TS val 111 ecr 1941833655], > length 1180 > 419 00.32 IP majava.nfs > utilite.1011: Flags [.], > ack 182645, win 1834, > options [nop,nop,TS val 1941834694 ecr 111], > length 0 > ! 420 00.07 IP utilite.1011 > majava.nfs: Flags [P.], > seq 181465:182645, ack 79601, win 3225, > options [nop,nop,TS val 113 ecr 1941833700], > length 1180 > 421 00.09 IP majava.nfs > utilite.1011: Flags [.], > ack 182645, win 1834, > options [nop,nop,TS val 1941834694 ecr 113,nop,nop, >sack 1 {181465:182645}], > length 0 > > Here frames 413, 414, 416 and 418 comprise single NFS write request. > All, but the last segments are sent very fast. Then note that the > last segment (418) is sent after a 1 second delay, and then > immediately resent (420, marked as "spurious retransmission" by > wireshark). > > This pattern repeats through the trace. From time to time a large > write has its last segment delayed by 1 second, then there's an ACK > from the server and then that last segment is immediately "spuriously" > resent. > > Does this ring any bells? Either from the TCP point of view or from > NFS might be doing here that might trigger that. > > Just copying a large file to the server seems to be ok, the > time/sequence plot is nice and linear. 
Looking at the same traffic from the client I see a different picture that complements the server side story. The NFS write request is sent in one batch of N chunks. The server acks chunks up to N-1, but not the last one. After one second the client retransmits the last chunk of the batch and gets two acks for it. So what seems to be happening is that enet(4) does not actually transmit the packet with the last chunk of data (418) along with its siblings (413, 414, 416). The server does not see it and so does not ack it. Eventually the rexmit timer kicks in and the client's TCP resends the last chunk in a new packet (420). At this point the old last packet (418) gets unstuck and so both the original last packet and the new copy are sent. So this seems to be some kind of enet(4) bug. mlelstv@ pointed out that TXDESC_WRITEOUT and RXDESC_WRITEOUT use PREWRITE, not POSTWRITE, which seems suspicious. Unfortunately changing them to POSTWRITE doesn't seem to help. Since this problem doesn't happen with all long WRITEs there must be something else at play here too. If I had to guess: ring buffer wraparound, maybe? -uwe
NFS client performance problems
I've stumbled into a weird performance problem with NFS client. I have CompuLab's Utilite Pro (evbarm) machine running a very current -current. It's connected to my Ubuntu laptop with Ethernet (direct cable link) and uses it as an NFS server to get e.g. pkgsrc tree and distfiles. The performance is really bad, e.g. make extract of a package may take literal ages (I did a lot of the initial investigation for this mail while python distfile was being extracted). Extracting 31M uncompressed bash distfile may take, from run to run: real2m21.110s user0m0.635s sys 0m4.233s or real4m52.010s user0m0.769s sys 0m4.815s or whatever. Looking at the traffic with wireshark I can see a curious recurring pattern. Looking at the time/sequence plot one immediately sees short bursts of activity separated by huge gaps of no traffic. Here's one such instance abridged and wrapped for clarity. Timestamps are shown as delta to the previous frame. Window scale is 3 for the client (utilite) and 7 for the server (majava), i.e the server has plenty of window open. > 413 00.000351 IP utilite.1011 > majava.nfs: Flags [.], seq 177121:178569, ack 79601, win 3225, options [nop,nop,TS val 111 ecr 1941833655], length 1448: NFS request xid 1992059772 1444 write fh ... 
5406 (5406) bytes @ 16384 > 414 00.48 IP utilite.1011 > majava.nfs: Flags [.], seq 178569:180017, ack 79601, win 3225, options [nop,nop,TS val 111 ecr 1941833655], length 1448 415 00.09 IP majava.nfs > utilite.1011: Flags [.], ack 180017, win 1834, options [nop,nop,TS val 1941833656 ecr 111], length 0 > 416 00.51 IP utilite.1011 > majava.nfs: Flags [.], seq 180017:181465, ack 79601, win 3225, options [nop,nop,TS val 111 ecr 1941833655], length 1448 417 00.043745 IP majava.nfs > utilite.1011: Flags [.], ack 181465, win 1834, options [nop,nop,TS val 1941833700 ecr 111], length 0 > 418 00.994813 IP utilite.1011 > majava.nfs: Flags [P.], seq 181465:182645, ack 79601, win 3225, options [nop,nop,TS val 111 ecr 1941833655], length 1180 419 00.32 IP majava.nfs > utilite.1011: Flags [.], ack 182645, win 1834, options [nop,nop,TS val 1941834694 ecr 111], length 0 ! 420 00.07 IP utilite.1011 > majava.nfs: Flags [P.], seq 181465:182645, ack 79601, win 3225, options [nop,nop,TS val 113 ecr 1941833700], length 1180 421 00.09 IP majava.nfs > utilite.1011: Flags [.], ack 182645, win 1834, options [nop,nop,TS val 1941834694 ecr 113,nop,nop, sack 1 {181465:182645}], length 0 Here frames 413, 414, 416 and 418 comprise single NFS write request. All, but the last segments are sent very fast. Then note that the last segment (418) is sent after a 1 second delay, and then immediately resent (420, marked as "spurious retransmission" by wireshark). This pattern repeats through the trace. From time to time a large write has its last segment delayed by 1 second, then there's an ACK from the server and then that last segment is immediately "spuriously" resent. Does this ring any bells? Either from the TCP point of view or from NFS might be doing here that might trigger that. Just copying a large file to the server seems to be ok, the time/sequence plot is nice and linear. -uwe
Re: autoloading compat43 on tty ioctls
On Sat, Oct 10, 2020 at 11:49:47 -0700, Paul Goyette wrote: > True, but the way ioctl's are handled in kern/tty.c seems to auto-load > the compat_43 and compat_60 modules for _any_ unhandled ioctl. So if > you have an illegal/invalid ioctl it will autoload the modules, and then > unload them 10 seconds later. > > I question whether we should do the autoloads... I think I mentioned exactly this problem in Lillehammer - I noticed it a few years ago b/c newer binutils started using some reloc type or other that sh3 kobj was not prepared to handle, so you'd see a kernel message on each (auto)load. -uwe
Re: "Boot this kernel once" functionality? (amd64)
On Wed, Sep 16, 2020 at 12:20:57 +0200, Anthony Mallet wrote: > On Wednesday 16 Sep 2020, at 12:09, Martin Husemann wrote: > > This works fine on e.g. sparc*; I can do: shutdown -b netbsd.t -r > > now > > > > No state is modified on any disks, very convenient. > > Right, not changing any state seems safer! > > > I don't know if there is enough of a persistent environment for UEFI > > boots (I would guess there is), and probably no easy way for BIOS > > boot. > > The machine in question is not UEFI, so I would be more interested in > a pure BIOS solution. As der Mouse mentioned upthread, kloader(4) would seem like a promising candidate to implement this. It doesn't support x86 currently, but existing kloader_machdep.c files are minuscule - the non-boilerplate code is essentially just one function that is little more than a fancy memcpy. The really interesting question is whether NetBSD on a given platform leaves the machine in the state that a newly booted kernel expects. The hpc* ports that support kloader do not expect anything much from the initial state of the machine. Of course that doesn't suit your immediate needs... -uwe
-.su file in kernel compile dir
Recent changes to record stack usage cause a file named -.su to be created (it refers to assym.c). It plays tricks with targets like clean that refer to *.su -uwe
Re: Submitting a new module example
On Tue, Jun 02, 2020 at 00:02:29 +, bmelo wrote: > I have written a ddb_hello module example. You can found the patch here: > https://pastebin.com/WCUpRc0J > > Could it be imported in src even if there is a ddbping example, please? It's the same skeleton code that ddbping already demoes (a bit less, actually), so I don't think there's a reason to have a duplicate example. Sorry I've beaten you to it, I had no idea. -uwe
Re: Rump makes the kernel problematically brittle
On Thu, Apr 02, 2020 at 23:29:55 +0300, Valery Ushakov wrote: > On Thu, Apr 02, 2020 at 16:15:30 -0400, Mouse wrote: > > > > http://www.fixup.fi/misc/rumpkernel-book/ > > > > That page I can look at fine, but when I try to fetch the PDF, I get a > > 403 Forbidden. In case it helps anyone, the body says > > > > Code: AccessDenied > > Message: Access Denied > > RequestId: CE223007341C4B9F > > HostId: > > iIeEi7wGEkGET4V/Pw2ndjkjrChsKswcqoLJJpmExJOrqdRFFgHw6L6XWjB2ZSNqBTTXyPHJYMI= > > Works with firefox. It porbably needs javascript or cookies or > whatever. It wants a referrer. Go to that page with lynx and you can download the pdf by following the link. -uwe
Re: Rump makes the kernel problematically brittle
On Thu, Apr 02, 2020 at 16:15:30 -0400, Mouse wrote: > > http://www.fixup.fi/misc/rumpkernel-book/ > > That page I can look at fine, but when I try to fetch the PDF, I get a > 403 Forbidden. In case it helps anyone, the body says > > Code: AccessDenied > Message: Access Denied > RequestId: CE223007341C4B9F > HostId: > iIeEi7wGEkGET4V/Pw2ndjkjrChsKswcqoLJJpmExJOrqdRFFgHw6L6XWjB2ZSNqBTTXyPHJYMI= Works with firefox. It probably needs javascript or cookies or whatever. > (While I recognize you may not be the person to say this to, denying > access like that without any indication of what the problem is or whom > to ask for help is...singularly useless.) This is so *richly* ironic coming from you :) I gave up trying to send you personal mail years ago. No, please, don't answer that, I'm no longer interested. Re whom to contact, there's a contact email literally right beneath that link. Not sure if your MX will agree to accept the reply though. :) PS: Sorry, I still can't stop giggling... PPS: Sorry... :) PPPS: *giggle* -uwe
Re: Rump makes the kernel problematically brittle
On Fri, Apr 03, 2020 at 02:23:31 +0700, Robert Elz wrote: > | Is this documented anywhere? > > You're putting documented and rump into the same thought space? http://www.fixup.fi/misc/rumpkernel-book/ -uwe
sys_ptrace_lwpstatus.c (Was: CVS commit: src/sys)
On Thu, Dec 26, 2019 at 08:52:39 +0000, Kamil Rytarowski wrote:

> Module Name: src
> Committed By: kamil
> Date: Thu Dec 26 08:52:39 UTC 2019
>
> Modified Files:
> src/sys/kern: files.kern sys_ptrace_common.c
> src/sys/sys: ptrace.h
> Added Files:
> src/sys/kern: sys_ptrace_lwpstatus.c
>
> Log Message:
> Put ptrace_read_lwpstatus() and process_read_lwpstatus() to a new file
>
> Fixes "no PTRACE" kernel build, in particular zaurus kernel=INSTALL_C700.

It is counterintuitive that a sys_ptrace* file with ptrace_* functions does not depend on options ptrace. That seems to be a strong indication the functions and the file are misnamed.

  file kern/sys_ptrace.c            ptrace
  file kern/sys_ptrace_common.c     ptrace
  file kern/sys_ptrace_lwpstatus.c  kern

-uwe
xc_barrier()
gcc 8 -Wcast-function-type (enabled by -Wextra that we do turn on for x86 ports and a few others) is not very happy about many function casts for nullop and friends in the kernel. A small portion of them is code that does xcall barrier with: uint64_t where; where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL); xc_wait(where); The attached patch replaces all these with xc_barrier(0); with obvious implementation. Suggestions for a better name and (especially) for the descriptive comment and the man-page text are welcome. -uwe Index: sys/xcall.h === RCS file: /cvsroot/src/sys/sys/xcall.h,v retrieving revision 1.7 diff -u -p -r1.7 xcall.h --- sys/xcall.h 27 Aug 2018 07:10:15 - 1.7 +++ sys/xcall.h 6 Oct 2019 12:28:38 - @@ -53,6 +53,8 @@ uint64_t xc_broadcast(u_int, xcfunc_t, v uint64_t xc_unicast(u_int, xcfunc_t, void *, void *, struct cpu_info *); void xc_wait(uint64_t); +void xc_barrier(u_int); + unsigned int xc_encode_ipl(int); #endif /* _KERNEL */ Index: kern/subr_xcall.c === RCS file: /cvsroot/src/sys/kern/subr_xcall.c,v retrieving revision 1.26 diff -u -p -r1.26 subr_xcall.c --- kern/subr_xcall.c 7 Feb 2018 04:25:09 - 1.26 +++ kern/subr_xcall.c 6 Oct 2019 12:28:37 - @@ -247,6 +247,30 @@ xc_init_cpu(struct cpu_info *ci) KASSERT(error == 0); } + +static void +xc_nop(void *arg1, void *arg2) +{ + +return; +} + + +/* + * xc_barrier: + * + * Broadcast a nop to all CPUs in the system. 
+ */ +void +xc_barrier(unsigned int flags) +{ + uint64_t where; + + where = xc_broadcast(flags, xc_nop, NULL, NULL); + xc_wait(where); +} + + /* * xc_broadcast: * Index: arch/x86/acpi/acpi_cpu_md.c === RCS file: /cvsroot/src/sys/arch/x86/acpi/acpi_cpu_md.c,v retrieving revision 1.79 diff -u -p -r1.79 acpi_cpu_md.c --- arch/x86/acpi/acpi_cpu_md.c 10 Nov 2018 09:42:42 - 1.79 +++ arch/x86/acpi/acpi_cpu_md.c 6 Oct 2019 12:28:35 - @@ -378,7 +378,6 @@ acpicpu_md_cstate_stop(void) { static char text[16]; void (*func)(void); - uint64_t xc; bool ipi; x86_cpu_idle_get(, text, sizeof(text)); @@ -393,8 +392,7 @@ acpicpu_md_cstate_stop(void) * Run a cross-call to ensure that all CPUs are * out from the ACPI idle-loop before detachment. */ - xc = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL); - xc_wait(xc); + xc_barrier(0); return 0; } Index: kern/kern_lwp.c === RCS file: /cvsroot/src/sys/kern/kern_lwp.c,v retrieving revision 1.204 diff -u -p -r1.204 kern_lwp.c --- kern/kern_lwp.c 3 Oct 2019 22:48:44 - 1.204 +++ kern/kern_lwp.c 6 Oct 2019 12:28:37 - @@ -367,7 +367,6 @@ static void lwp_dtor(void *arg, void *obj) { lwp_t *l = obj; - uint64_t where; (void)l; /* @@ -379,8 +378,7 @@ lwp_dtor(void *arg, void *obj) * the value of l->l_cpu must be still valid at this point. */ KASSERT(l->l_cpu != NULL); - where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL); - xc_wait(where); + xc_barrier(0); } /* Index: kern/kern_ras.c === RCS file: /cvsroot/src/sys/kern/kern_ras.c,v retrieving revision 1.38 diff -u -p -r1.38 kern_ras.c --- kern/kern_ras.c 4 Jul 2016 07:56:07 - 1.38 +++ kern/kern_ras.c 6 Oct 2019 12:28:37 - @@ -66,9 +66,7 @@ ras_sync(void) /* No need to sync if exiting or single threaded. 
*/ if (curproc->p_nlwps > 1 && ncpu > 1) { #ifdef NO_SOFTWARE_PATENTS - uint64_t where; - where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL); - xc_wait(where); + xc_barrier(0); #else /* * Assumptions: Index: kern/kern_softint.c === RCS file: /cvsroot/src/sys/kern/kern_softint.c,v retrieving revision 1.47 diff -u -p -r1.47 kern_softint.c --- kern/kern_softint.c 17 May 2019 03:34:26 - 1.47 +++ kern/kern_softint.c 6 Oct 2019 12:28:37 - @@ -407,7 +407,6 @@ softint_disestablish(void *arg) softcpu_t *sc; softhand_t *sh; uintptr_t offset; - uint64_t where; u_int flags; offset = (uintptr_t)arg; @@ -432,8 +431,7 @@ softint_disestablish(void *arg) * SOFTINT_ACTIVE already set. */ if (__predict_true(mp_online)) { - where = xc_broadcast(0, (xcfunc_t)nullop, NULL, NULL); - xc_wait(where); + xc_barrier(0); } for (;;) { Index: kern/kern_syscall.c
Re: Proposal, again: Disable autoload of compat_xyz modules
On Fri, Sep 27, 2019 at 11:36:08 -, Christos Zoulas wrote: > >} I propose something very slightly different that can preserve the current > >} functionality with user action: > >} > >} 1. Remove them from standard kernels in architectures where modules are > >}supported. Users can add them back or just use modules. > >} 2. Disable autoloading, but provide a sysctl to enable autoloading > >}(1 global sysctl for all compat modules). Users can change the default > >}in /etc/sysctl.conf (adds sysctl to the proposal) > > > > You mean this (first line): > > > >i386devel: {31} sysctl kern.module > >kern.module.autoload = 0 > >kern.module.verbose = 0 > >kern.module.path = /stand/amd64-xen/8.99.26/modules > >kern.module.autotime = 10 > > Perhaps: > > kern.module.autoload.disable = linux,linux32 Maybe we should take a look at how SNMP did tables in MIBs, b/c we are trying to create just such a table indexed by module name. Also, I'm not that sure about autoload of compat stuff especially since iirc it currently implies auto-unload too. I vaguely remember when I was debugging something in sh3 kobj_machdep.c I had some printfs there that made the autoloads visible and (iirc) each vi invocation would trigger an autoload of compat ioctl code (which wouldn't recognize the ioctl, and that would be auto-unloaded a few seconds later). -uwe
Re: Proposal, again: Disable autoload of compat_xyz modules
On Fri, Sep 27, 2019 at 10:57:12 +0200, Jaromír Doleček wrote: > Le jeu. 26 sept. 2019 à 18:08, Manuel Bouyer a écrit > : > > > > On Thu, Sep 26, 2019 at 05:10:01PM +0200, Maxime Villard wrote: > > > issues for a clearly marginal use case, and given the current general > > ^^^ > > > > This is where we dissagree. You guess it's marginal but there's no > > evidence of that (and there's no evidence of the opposite either). > > FYI - I've put also a lot of efford into fixing & enhancing > compat_linux in past. I also greatly appreciate all the work work of > other folks working on the layer, it's super useful in some situations > - browser with flash support used to be important (thankfully not > anymore), also vmware and matlab, I also used some Oracle dev tools. > However, that is not the topic of the discussion. > > Let's concentrate on whether it should be enabled by default. Yes, please. This discussion has veered way off topic. > Given the history, to me it's completely clear compat_linux shouldn't > be on by default. Any possible linux-specific exploits should only be > problem for people actually explicitly enabling it. Let's just stop > pretending that we'd setup any kind of reasonable testing suite for > this - it has not been done in last >20 years, it's even less likely > to happen now that most of the major use cases are actually moot. > > As Maya suggested, let's keep this concentrated on COMPAT_LINUX only > to avoid further bikeshed flogging, so basically I propose doing this: > 1) Comment out COMPAT_LINUX from all kernels configs for all archs > which support modular > 2) Disable autoload for compat_linux, requiring the user to explicitly > configure system to load it. No extra sysctl. > > Any major and specific objections? At some point it became very hard to follow the technical content of this thread, but I don't think there were any. Thanks! -uwe
Re: mknod(2) and POSIX
On Tue, Jun 18, 2019 at 17:22:14 +0200, Kamil Rytarowski wrote: > I wrote a patch to add support for it, but untested as currently the > kernel build is broken: > > http://netbsd.org/~kamil/patch-00128-posix-mknod.txt > > Independently, I have removed unused variable retval. > > If this patch is fine and once the kernel will be unbroken, I can land > it, document and add ATF tests. Please, please, please, don't mix unrelated changes. If retval is unused already, g/c it first in a separate commit. -uwe
Re: mknod(2) and POSIX
On Tue, Jun 18, 2019 at 14:30:26 +0200, Jason Thorpe wrote: > > On Jun 18, 2019, at 2:25 PM, Jason Thorpe wrote: > > > >> On Jun 18, 2019, at 2:01 PM, Greg Troxel wrote: > >> > >> I realize mkfifo is preferred in our world, and POSIX says it is > >> preferred. But I believe we have a failure to follow POSIX. > >> > >> Other opinions? > > > > Seems you are correct. > > Sorry! Hit "send" prematurely. > > mknod(2) for the FIFO case should allow users under the same > circumstances that mkfifo(2) does. Since our mknod() is a wrapper, we can trivially dispatch to the mkfifo syscall for mknod calls with S_IFIFO, can't we? I don't think we should make the mknod syscall itself support this. -uwe
Re: fork-the-syscall return semantics
On Sat, Feb 16, 2019 at 20:14:35 -0500, Mouse wrote: > In fork1(), in kern/kern_fork.c, there is code > > /* > * Return child pid to parent process, > * marking us as parent via retval[1]. > */ > if (retval != NULL) { > retval[0] = p2->p_pid; > retval[1] = 0; > } > > This is very old code; identical code appears as far back as 1.4T, > quite likely even farther back. It appears the return semantics of > fork-the-syscall-trap (and related calls, like __vfork14) are a bit > odd: the parent returns and the child returns (or > at least so a comment in the SPARC libc wrapper claims; I haven't dug > enough to find the kernel code where the child's return values are set > up). But I see no reason for this, as the libc wrapper immediately > destroys the first return value in the child. > > Does anyone happen to know why this was done? So far I haven't found > any reason to not simply return the abstract return value in retval[0] > like most other syscalls that return a simple integer value, but for a > special case like this to have survived this long, I can't help feeling > there must be _something_ behind it. I would look at http://mail-index.netbsd.org/source-changes/1995/12/10/msg012114.html Modified Files: init_main.c Log Message: Change the way we test whether or not we're in the child process. except there seems to be no such commit actually recorded in init_main.c log :) Was reverted in the repo, I guess. The code before that change looks like: #ifdef cpu_set_init_frame /* XXX should go away */ if (rval[1]) { /* * Now in process 2. */ start_pagedaemon(curproc); } #else Its counterpart is http://mail-index.netbsd.org/source-changes/1995/12/10/msg012115.html Modified Files: kern_fork.c Log Message: If __FORK_BRAINDAMAGE, continue stuffing retval[1] for the benefit of main(). 
Other relevant commits are probably: http://mail-index.netbsd.org/source-changes/1995/12/09/.html http://mail-index.netbsd.org/source-changes/1995/12/09/msg012096.html http://mail-index.netbsd.org/source-changes/1995/12/09/msg012098.html -uwe
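For readers who have not looked at such a stub, here is a small model of how a libc wrapper can turn the two-value return into the familiar fork() result. This is an illustration only, not the actual libc source: the kernel code quoted above sets retval[1] to 0 in the parent, while the value 1 for the child and the masking trick are my assumptions for the sake of the example.

```c
#include <assert.h>

/*
 * Model of the two-value return described above (an illustration, not
 * the actual libc source): rv0 is a pid in both processes, rv1 is 0 in
 * the parent and assumed to be 1 in the child.
 */
static long
fork_stub_result(long rv0, long rv1)
{
	assert(rv1 == 0 || rv1 == 1);
	/* rv1 - 1 is all ones in the parent and zero in the child. */
	return rv0 & (rv1 - 1);
}
```

In the parent the first value passes through untouched; in the child it is masked away, which is the "destroys the first return value in the child" behaviour mentioned in the quoted mail.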
Re: Help needed with understanding of config(1) debug output
On Thu, Sep 27, 2018 at 16:20:50 +0800, Paul Goyette wrote: > I've got a problem where something I've changed over the last six months > (or more) on the [pgoyette-compat] branch has broken the release build > for at least ``build.sh -m algor'' port. For some unknown reason it is > defining COMPAT_NETBSD32 in opt_compat_netbsd32.h even though the option > is not selected in the kernel definition file. > > I've tried to understand the debug output from ``config -d ...'' but > I simply don't understand the output. (The output looks more like it is > intended to debug config(1) itself, and not for debugging issues with > config's input files.) I find the following snippet in the debug output > >dependopts:326: debug: depend attr `COMPAT_NETBSD32' >dependopts:326: debug: option selected `compat_netbsd32' >dependopts:326: debug: depend `COMPAT_NETBSD32' searched > > This seems to indicate that attribute COMPAT_NETBSD32 was previously > "needed" and therefore we need to include option `compat_netbsd32'. But > there is no earlier mention of COMPAT_NETBSD32 in the debug output. You made EXEC_ELF32 depend on COMPAT_NETBSD32 and since you enable EXEC_ELF32, it pulls in COMPAT_NETBSD32 that it now depends on. -uwe
Re: How to prevent a mutex being _enter()ed from being _destroy()ed?
On Sat, Aug 11, 2018 at 00:46:26 +0700, Robert Elz wrote: > Date:Fri, 10 Aug 2018 08:03:55 -0400 > From:Greg Troxel > Message-ID: > > | Ancient BSD tradition is not to explain these things :-( > > Older than that. Don't you remember > you are not expected to understand this > (or wording very similar) in ancient 4th/5th edition unix. The explanation for that comment I've read somewhere was that it really meant "it will not be in the exam" (which is a wonderful story even if it's not true :). -uwe
Old FFS triggers assertion in BUFRD()
I have found an OpenWindows Version 3 CD in a drawer. The label claims "ISO 9660 format", but it's really an FFS image. I was able to mount it with a little tweak - ffs_superblock_validate() checked only fs_size, but this CD from 1991 only has fs_old_size (fix committed). The next hiccup I ran into was an assertion in ufs_readwrite.c:172 (rump is compiled with DIAGNOSTIC). KASSERT(vp->v_type != VLNK || ump->um_maxsymlinklen != 0 || DIP(ip, blocks) == 0); It was triggered by di_blocks being 2. Maybe someone with enough FFS clue could take a look? Disc image available upon request. -uwe
Leaking kernel stack data in struct padding
On Wed, Jun 13, 2018 at 02:09:09 +0000, Valeriy E. Ushakov wrote:

> Module Name: src
> Committed By: uwe
> Date: Wed Jun 13 02:09:09 UTC 2018
>
> Modified Files:
> src/sys/dev/wscons: wsevent.c
>
> Log Message:
> wsevent_copyout_events50 - don't leak garbage from the kernel stack.
>
> On 64-bit machines struct timespec50 has padding between 32-bit tv_sec
> and long tv_nsec that is not affected by normal assignment. Scrub it
> before we uiomove struct owscons_event.

I was looking at mouse events on an amd64 VM with

  # hexdump -e '/4 " %2d" /4 " %5d" /8 " %d" /8 ".%09d" "\n"' /dev/wsmouse

note: wscons event sources give you compat event structs unless you request the current version with an ioctl (which is kinda hard to do in hexdump :).

I noticed that the first reported event always had a bogus timestamp. Took me a bit of time to realize what was going on. I fixed it in wsevent.c (indentation reduced for readability):

  +#if INT32_MAX < LONG_MAX /* scrub padding */
  + memset(&ev50.time, 0, offsetof(struct timespec50, tv_nsec));
  +#endif
    timespec_to_timespec50(&ev->time, &ev50.time);

but I wonder if this scrubbing should be moved into timespec_to_timespec50() - after all the most likely use of the compat struct is to write or copyout it in the compat code, so the same problem probably happens elsewhere.

On amd64 the compiler is smart enough to convert memset() to a few movq's. The compiler is not smart enough to notice that tv_nsec is written to in timespec_to_timespec50(), so

  memset(&ev50.time, 0, sizeof(ev50.time));
  timespec_to_timespec50(...);

would still emit two movq's immediately followed by another movq to tv_nsec. Hence these specific arguments in the call to memset().

Comments?

PS: The next logical question is if there's a tool that can help audit the rest of the kernel for problems like that. :)

-uwe
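For anyone who wants to poke at the effect outside the kernel, here is a small userland sketch of the same idea. struct ts50 is a stand-in I made up with the same shape as struct timespec50 (32-bit seconds followed by a long), and both helper names are hypothetical. Strictly speaking C lets padding take unspecified values on a member store, so this relies on the same practical compiler behaviour the kernel fix does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for struct timespec50: 32-bit seconds followed by a long. */
struct ts50 {
	int32_t	tv_sec;
	long	tv_nsec;
};

/* Fill *dst the way the fix does: scrub up to tv_nsec, then assign. */
static void
ts50_fill(struct ts50 *dst, int32_t sec, long nsec)
{
	assert(dst != NULL);
	memset(dst, 0, offsetof(struct ts50, tv_nsec));
	dst->tv_sec = sec;
	dst->tv_nsec = nsec;
}

/* Return 1 if every byte between tv_sec and tv_nsec is zero. */
static int
ts50_padding_is_zero(const struct ts50 *ts)
{
	const unsigned char *p = (const unsigned char *)ts;
	size_t i;

	for (i = sizeof(ts->tv_sec); i < offsetof(struct ts50, tv_nsec); i++)
		if (p[i] != 0)
			return 0;
	return 1;
}
```

On an ILP32 machine there is no padding, so the check loop is empty and trivially succeeds; on LP64 it covers the four bytes that would otherwise carry whatever was on the stack.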
Re: I would like to contribute to NetBSD
On Sat, Apr 07, 2018 at 18:43:29 -0700, Andy Ruhl wrote: > On Fri, Apr 6, 2018 at 7:31 AM, Narendra Kangralkar >wrote: > > Hello All, > > > > I found that NetBSD a supported guest OS under VirtualBox project is > > partially completed. I would like to work on this if this project is still > > available. Please let me know your thoughts regarding this. > > I use NetBSD under Virtualbox so I'm guessing you're talking about > making a supported set of guest additions? Support for NetBSD Guest Additions has been committed to the VirtualBox tree quite a while ago. Though there's still no pkgsrc package. -uwe
Re: setting DDB_COMMANDONENTER="bt" by default
On Sat, Feb 17, 2018 at 08:35:32 +1100, matthew green wrote: > Valery Ushakov writes: > > On Thu, Feb 15, 2018 at 01:19:31 +, Sevan Janiyan wrote: > > > > > > I might/would suggest > > > > > > > >OPTIONS DDB_ONPANIC=2 > > > > > > clear, any reason not to have this as a default? (I'm going to sleep on > > > it) > > > > As someone has already mentioned upthread, because printing a > > backtrace might cause another panic, so the default was selected to be > > on the safe(r) side. At least that's what I recall. > > i don't think this is the case. > > the builtin stack trace code is fault-tolerant. if it > faults, it will not re-try and you'll get a db> prompt. My memory is hazy. I do have (for more than a decade it seems) a local change in db_trap() that adds db_recover around db_print_loc_and_inst() call, but I think that was to protect from fat fingers in ddb (hpcsh keyboard is tiny :). -uwe
Re: setting DDB_COMMANDONENTER="bt" by default
On Thu, Feb 15, 2018 at 02:11:07 +, Sevan Janiyan wrote: > On 02/15/18 01:23, Valery Ushakov wrote: > > As someone has already mentioned upthread, because printing a > > backtrace might cause another panic, so the default was selected to be > > on the safe(r) side. At least that's what I recall. > > On 02/15/18 01:33, Paul Goyette wrote: > > Yes, that matches my recall as well. > > Ah, ok, so leave this to rest? (is it worth testing in -current to see > how things go?) Well, "testing" here would be to throw random garbage in the stack for "bt" to choke on (and that garbage might also need to point to just the right other data). You might be able to script this with something like vbox snapshots I guess, by snapshotting a VM when it's in ddb and then fuzzing the kernel stack before resuming it (I don't remember if vbox vm debugger is scriptable, you might also need to hack it a bit to be). -uwe
Re: setting DDB_COMMANDONENTER="bt" by default
On Thu, Feb 15, 2018 at 01:19:31 +, Sevan Janiyan wrote: > > I might/would suggest > > > >OPTIONS DDB_ONPANIC=2 > > clear, any reason not to have this as a default? (I'm going to sleep on it) As someone has already mentioned upthread, because printing a backtrace might cause another panic, so the default was selected to be on the safe(r) side. At least that's what I recall. -uwe
Re: gcc: optimizations, and stack traces
[Summoning Krister] On Fri, Feb 09, 2018 at 11:23:17 +0100, Maxime Villard wrote: > There are also several cases where functions in the call tree can disappear > from the backtrace. In the following call tree: > > A -> B -> C -> D (and D panics) > > if, in B, GCC put the two instructions after the instruction that calls C, > the backtrace will be: > > A -> C -> D > > This can make a bug completely undebuggable. Does gcc actually generate code like that? I thought that it can delay frame pointer creation, but only until it needs to make a nested call, to C in your example (as in the sample I showed in another mail to this thread). -uwe
Re: gcc: optimizations, and stack traces
On Fri, Feb 09, 2018 at 11:38:47 +0100, Martin Husemann wrote: > On Fri, Feb 09, 2018 at 11:23:17AM +0100, Maxime Villard wrote: > > > When I spotted this several months ago (while developing Live > > Kernel ASLR), I tried to look for GCC options that say "optimize > > with -O2, but keep the stack trace intact". I couldn't find one, > > and the only thing I ended up doing was disabling -O2 in the > > makefiles. > > -fno-omit-frame-pointer? That won't help. `-O' also turns on `-fomit-frame-pointer' on machines where doing so does not interfere with debugging. so it's not turned off in the first place. The problem is that some of the later optimization passes may push frame pointer setup to some place later in function. E.g. on -7 void kernfs_get_rrootdev(void) { static int tried = 0; if (tried) { /* Already did it once. */ return; } tried = 1; if (rootdev == NODEV) return; rrootdev = devsw_blk2chr(rootdev); if (rrootdev != NODEV) return; rrootdev = NODEV; printf("kernfs_get_rrootdev: no raw root device\n"); } is compiled to c068f81b : c068f81b: mov0xc0fc6b40,%eax c068f820: test %eax,%eax c068f822: jnec068f867c068f824: movl $0x1,0xc0fc6b40 c068f82e: mov0xc0fde0b8,%edx c068f834: mov0xc0fde0bc,%eax c068f839: mov%edx,%ecx c068f83b: and%eax,%ecx c068f83d: cmp$0x,%ecx c068f840: je c068f867 -> c068f842: push %ebp -> c068f843: mov%esp,%ebp c068f845: sub$0x8,%esp c068f848: mov%edx,(%esp) c068f84b: mov%eax,0x4(%esp) c068f84f: call c091ce52 So the "tried" check and the first "rootdev" check happen before the frame pointer is set up. -uwe
Re: Proposal to obsolete SYS_pipe
On Tue, Dec 26, 2017 at 01:29:42 +, Christos Zoulas wrote: > In article, > Kamil Rytarowski wrote: > >-=-=-=-=-=- > >-=-=-=-=-=- > > > >On 25.12.2017 17:43, Christos Zoulas wrote: > >> On Dec 25, 4:42pm, n...@gmx.com (Kamil Rytarowski) wrote: > >> -- Subject: Re: Proposal to obsolete SYS_pipe > >> > >> | I've extracted two changes from the original mail: > >> | > >> | https://mail-index.netbsd.org/tech-kern/2017/12/25/msg022836.html > >> > >> Yes, the first patch is exactly what I had in mind; remove the > >> assembly stubs from libc and make pipe() a wrapper for pipe2(). > >> The second patch sounds good too, but it is not in the email... > >> > >> christos > >> > > > >I've included the missing patch in the subsequent mail: > > > >https://mail-index.netbsd.org/tech-kern/2017/12/25/msg022840.html > > > >Patch (pasted here for the reference): > > > >http://netbsd.org/~kamil/patch-00041-refactor-pipe1.txt > > I am good with both since they eliminate the MD code and simplify > the MI code. The only drawback is that sys_pipe (the system call) > is not handled directly anymore by libc, but that's not an issue > except for the slight performance loss (which does not really matter > the moment you start doing I/O). Why can't we just leave pipe() alone? There are other syscalls that return two values, e.g. fork. The MD asm stubs are trivial and they are already written. They've been there for ages. Why the sudden desire to "create movement"? The pipe1() change is a good thing, OTOH. -uwe
Re: Proposal to obsolete SYS_pipe
On Mon, Dec 25, 2017 at 16:37:43 +0100, Kamil Rytarowski wrote: > On 24.12.2017 22:25, Kamil Rytarowski wrote: > > > http://netbsd.org/~kamil/patch-00039-obsolete-SYS_pipe.txt > > I've extracted two patches from the above proposal. > > In these patches SYS_pipe is not marked COMPAT_80 and not removed from > rump. I've left it as it is. > > 1. Implement pipe() with pipe2(2) in libc: > > New source code is now Machine Independent. > > http://netbsd.org/~kamil/patch-00040-implement-pipe-with-pipe2-in-libc.txt > > The generated code in libc for x86_64 is also simpler and shorter: > > 0008b2a2 <_pipe>: > 8b2a2: 31 f6 xor %esi,%esi > 8b2a4: e9 b7 f5 fa ff jmpq 3a860 But you incur the price of pipe2's copyout(). I'm curious, does anyone know how things like SMAP contribute to that price? > 2. Refactor pipe1() kernel-internal function to operate over int[2] > rather than register_t[2]. Stop returning garbage through retval[2] > from pipe2(2). Please, can you be more specific with your characterizations? "Returning garbage" is vague, and without further details (that you do know yourself but don't disclose) makes every reader expend time and mental effort to figure out what you are really talking about. For reference, sys_pipe2() overwrites retval[1] with the second descriptor b/c it passes retval[] to pipe1(), like sys_pipe() does. But what is the intended effect for pipe() causes the retval[1] register to be clobbered for pipe2(). -uwe
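For readers who have not opened the patch: the libc-side idea discussed above boils down to something like the sketch below. This is not the actual patch (that is at the URL quoted above and is not reproduced here); the function name pipe_via_pipe2 is made up for illustration, and the _GNU_SOURCE define is only there so that the sketch also builds against glibc.

```c
#define _GNU_SOURCE	/* glibc hides pipe2() without this; not needed on NetBSD */
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for a machine-independent pipe().
 * Both descriptors come back through fildes[] (copied out by the
 * kernel) instead of in two return registers, which is what the
 * MD assembly stubs for SYS_pipe deal with.
 */
int
pipe_via_pipe2(int fildes[2])
{
	assert(fildes != NULL);
	return pipe2(fildes, 0);	/* flags == 0: plain pipe() semantics */
}
```

The trade-off raised in the reply is visible here: the wrapper is trivially MI, but every call now goes through pipe2()'s copyout() of the descriptor pair.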
config vs. modules ioconf files
[torn off of the original thread] On Fri, Dec 08, 2017 at 04:40:01 +0300, Valery Ushakov wrote: > Date: Fri, 8 Dec 2017 04:40:01 +0300 > From: Valery Ushakov <u...@stderr.spb.ru> > Subject: Re: Attaching to an attribute > To: tech-kern@netbsd.org > Mail-Followup-To: tech-kern@netbsd.org > > On Fri, Dec 08, 2017 at 04:29:49 +0300, Valery Ushakov wrote: > > > On Thu, Dec 07, 2017 at 23:07:47 +0300, Valery Ushakov wrote: > > > > > However config(1) instead of providing single wildcard parent spec > > > seems to instantiate parent specs for all parents it's seen that carry > > > the attribute. > > > > Bah, my emacs has too many buffers. Apparently I was looking at the > > kernel config from a different architecture. > > > > Astonishingly, i386 and amd64 GENERIC do _not_ have > > > > wsmouse* at wsmousedev? > > > > wildcard attachment and instead use separate attachments for each > > parent. I'm overcome with nostalgia, but this probably should be > > fixed, it's not 1990s anymore. > > This, however, still highlights a problem. How can a module device > driver attach wsmouse as a child regardless of how the kernel is > configured. I have filed http://gnats.netbsd.org/52821 for this so that it's not lost in the proverbial cracks. Since most people don't read all of netbsd-bugs@ I'm also duplicating it here. Separately, so that the PR is not spammed with every reply (should there be any :). config(8) supports generating autoconf glue for modules with (still undocumented!) "ioconf" keyword. Multiple examples can be found under sys/modules. Unfortunately in certain circumstances it generates ioconf.c structures that are not directly usable. Consider the ioconf file for VirtualBox Guest Additions driver: ioconf vboxguest include "conf/files" include "dev/i2o/files.i2o" # XXX: pci needs device iop include "dev/pci/files.pci" device vboxguest: wsmousedev attach vboxguest at pci pseudo-root pci* vboxguest0 at pci? dev ? function ? wsmouse* at vboxguest?
wsmouse(4) attachment is necessary here because generally speaking we cannot rely on the kernel that loads the module to have wsmouse* at wsmousedev? and in fact until very recently i386 and amd64 kernels didn't, they only had attachments to specific parents. Unfortunately config(8) is overzealous and seeing that wsmouse attachment causes it to emit CFDRIVER_DECL(wsmouse, ...) and it also includes wsmouse into cfdriver_ioconf_vboxguest[] and cfattach_ioconf_vboxguest[] arrays that are to be passed to config_init_component(9). That obviously causes the modload to fail as the wsmouse driver is already registered with autoconf. My guess is that config(8) emits these because it sees the attachments. This probably made sense for the in-tree modules, where the actual "device" command comes from the relevant "files.*" file, so the only way for config to infer what to emit is to look at the attachments. Also all in-tree modules only ever attach a single driver, so they never run into this problem with config (though I think uatp module should fail to attach wsmouse when loaded). We need a way to tell config which definitions it should emit. Just off the top of my head, maybe we can just mark the attachments, e.g: module vboxguest* at pci? dev ? function ? or even module vboxguest where config can see that vboxguest has single possible parent and infer the wildcard attachment. While here, it can also infer necessary pseudo-root so that the user doesn't have to specify it. -uwe
Re: Attaching to an attribute
On Fri, Dec 08, 2017 at 04:29:49 +0300, Valery Ushakov wrote: > On Thu, Dec 07, 2017 at 23:07:47 +0300, Valery Ushakov wrote: > > > However config(1) instead of providing single wildcard parent spec > > seems to instantiate parent specs for all parents it's seen that carry > > the attribute. > > Bah, my emacs has too many buffers. Apparently I was looking at the > kernel config from a different architecture. > > Astonishingly, i386 and amd64 GENERIC do _not_ have > > wsmouse* at wsmousedev? > > wildcard attachment and instead use separate attachments for each > parent. I'm overcome with nostalgia, but this probably should be > fixed, it's not 1990s anymore. This, however, still highlights a problem. How can a module device driver attach wsmouse as a child regardless of how the kernel is configured? -uwe
Re: Attaching to an attribute
On Thu, Dec 07, 2017 at 23:07:47 +0300, Valery Ushakov wrote: > However config(1) instead of providing single wildcard parent spec > seems to instantiate parent specs for all parents it's seen that carry > the attribute. Bah, my emacs has too many buffers. Apparently I was looking at the kernel config from a different architecture. Astonishingly, i386 and amd64 GENERIC do _not_ have wsmouse* at wsmousedev? wildcard attachment and instead use separate attachments for each parent. I'm overcome with nostalgia, but this probably should be fixed, it's not the 1990s anymore. -uwe
Attaching to an attribute
Devices can be attached to an attribute, e.g. wsmouse* at wsmousedev? where potential parents declare to have that attribute, e.g. device ums: hid, wsmousedev and the autoconf code knows how to attach to the attribute only: static int cfparent_match(const device_t parent, const struct cfparent *cfp) { /* ... */ /* * If no specific parent device instance was specified (i.e. * we're attaching to the attribute only), we're done! */ if (cfp->cfp_parent == NULL) return 1; /* * Check the parent device's name. */ if (STREQ(pcd->cd_name, cfp->cfp_parent) == 0) return 0;/* not the same parent */ /* * Make sure the unit number matches. */ if (cfp->cfp_unit == DVUNIT_ANY || /* wildcard */ cfp->cfp_unit == parent->dv_unit) return 1; /* Unit numbers don't match. */ return 0; } However config(1) instead of providing single wildcard parent spec seems to instantiate parent specs for all parents it's seen that carry the attribute. Check ioconf.c of your kernel: instead of single static const struct cfparent pspecXXX = { "wsmousedev", NULL, DVUNIT_ANY }; struct cfdata cfdata[] = { ... { "wsmouse", "wsmouse", 0, STAR, loc+XXX, 0, }, ... }; it emits static const struct cfparent pspec15 = { "wsmousedev", "spic", DVUNIT_ANY }; /* ... */ /*238: wsmouse* at spic? mux 0 */ { "wsmouse","wsmouse", 0, STAR, loc+1423, 0, }, /*239: wsmouse* at pms? mux 0 */ { "wsmouse","wsmouse", 0, STAR, loc+1424, 0, }, /*240: wsmouse* at ums? mux 0 */ { "wsmouse","wsmouse", 0, STAR, loc+1425, 0, }, /* ... */ for each device with wsmousedev attribute. This wastes a bit of memory in the static config, but that's not much of a problem. However if you want to attach such device to an attribute on another device you load as a module, you can't, at least naively, b/c there's no wildcard pspec for the wsmouse. In existing code only uatp(4) module attaches wsmouse(4). I don't have one, but my prediction is that it will fail with "device not configured". Can someone with the device try and verify that? 
You can add wsmouse* at wsmousedev? to the module's ioconf. Surprisingly, that generates wildcard parent spec for wsmouse! But it also adds wsmouse to cfdriver and cfattach arrays and loading the module will fail with EEXIST. The workaround seems to be to manually hack the ioconf.c so that the module has the wildcard pspec line for wsmouse in cfdata only. Anyone with enough config clue to comment (or better yet, fix :)? -uwe
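To make the difference between the two kinds of parent spec concrete, here is a small userland model of the tail of cfparent_match() quoted above. It is only a sketch: the model_* structures are cut-down, made-up stand-ins for struct cfparent and device_t, and the interface-attribute check is left out (the parent is assumed to carry the attribute).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MODEL_UNIT_ANY	(-1)	/* stands in for DVUNIT_ANY */

/* Simplified stand-in for struct cfparent. */
struct model_cfparent {
	const char *ifattr;	/* interface attribute, e.g. "wsmousedev" */
	const char *parent;	/* parent driver name, NULL = attribute only */
	int unit;		/* parent unit or MODEL_UNIT_ANY */
};

/* Simplified stand-in for a parent device instance. */
struct model_device {
	const char *driver;	/* driver name, e.g. "ums" */
	int unit;
};

/*
 * Same decision order as the quoted kernel code: an attribute-only
 * spec matches any parent, otherwise driver name and unit must agree.
 */
int
model_parent_match(const struct model_device *parent,
    const struct model_cfparent *cfp)
{
	assert(parent != NULL && cfp != NULL);

	if (cfp->parent == NULL)
		return 1;	/* attaching to the attribute only */
	if (strcmp(parent->driver, cfp->parent) != 0)
		return 0;	/* not the same parent */
	if (cfp->unit == MODEL_UNIT_ANY || cfp->unit == parent->unit)
		return 1;
	return 0;		/* unit numbers don't match */
}
```

With only per-parent specs such as { "wsmousedev", "spic", DVUNIT_ANY } in the kernel's cfdata, a parent that arrives later in a module (say, vboxguest) matches none of them; the wildcard { "wsmousedev", NULL, DVUNIT_ANY } is the only one that would.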
Re: amd64: kernel aslr support
On Sat, Oct 07, 2017 at 20:42:58 +0200, Maxime Villard wrote: > On 04/10/2017 at 21:00, Maxime Villard wrote: > > Here is a Kernel ASLR implementation for NetBSD-amd64. > > [...] > > Known issues: > > [...] > > * There are several redefinitions in the prekern headers. The way to remove > > them depends on where we put the prekern in the source tree. > > Does someone have a preference on where to put the prekern? I guess I'll > put it in src/sys/arch/amd64/prekern/. I'd say src/sys/arch/amd64/stand/prekern to conform to existing practice. -uwe
Re: Patching wscons_keydesc at runtime
On Fri, Aug 04, 2017 at 02:46:15 +0300, Valery Ushakov wrote: > Date: Fri, 4 Aug 2017 02:46:15 +0300 > From: Valery Ushakov <u...@stderr.spb.ru> > Subject: Re: Patching wscons_keydesc at runtime > To: tech-kern@netbsd.org > Mail-Followup-To: tech-kern@netbsd.org > > On Fri, Aug 04, 2017 at 01:38:38 +0200, Emmanuel Dreyfus wrote: > > > Emmanuel Dreyfus <m...@netbsd.org> wrote: > > > > > > Unfortunately this breaks hpcsh which initializes console very early > > > > when malloc is not available, so when you boot with wscons the machine > > > > wedges. > > > > > > > > I think your change should be reverted for now and a different fix > > > > developed. > > > > > > Or perhaps it could be just ifdef hpcarm? > > > > What about this change? > > > > Index: sys/dev/hpc/hpckbd.c > > === > > RCS file: /cvsroot/src/sys/dev/hpc/hpckbd.c,v > > retrieving revision 1.31 > > diff -U4 -r1.31 hpckbd.c > > --- sys/dev/hpc/hpckbd.c12 Jun 2017 09:23:39 - 1.31 > > +++ sys/dev/hpc/hpckbd.c3 Aug 2017 23:36:47 - > > @@ -265,15 +265,17 @@ > > const keysym_t *map, int mapsize) > > { > > int i; > > const struct wscons_keydesc *desc; > > +#ifdef hpcarm > > static struct wscons_keydesc *ndesc = NULL; > > > > /* > > * fix keydesc table. Since it is const data, we must > > -* copy it once before changingg it. > > +* copy it once before changingg it. That does not work > > +* on hpcsh which initialize console before malloc is > > +* available. 
> > */ > > - > > if (ndesc == NULL) { > > size_t sz; > > > > for (sz = 0; hpckbd_keymapdata.keydesc[sz].name != 0; sz++); > > @@ -282,14 +284,15 @@ > > memcpy(ndesc, hpckbd_keymapdata.keydesc, sz * > > sizeof(*ndesc)); > > > > hpckbd_keymapdata.keydesc = ndesc; > > } > > +#endif /* hpcarm */ > > > > desc = hpckbd_keymapdata.keydesc; > > for (i = 0; desc[i].name != 0; i++) { > > if ((desc[i].name & KB_MACHDEP) && desc[i].map == NULL) { > > - ndesc[i].map = map; > > - ndesc[i].map_size = mapsize; > > + desc[i].map = map; > > + desc[i].map_size = mapsize; > > } > > } > > > > return; > > I think it might be better to just have two copies of the function, > old and new. E.g. this patch doesn't restore the unconst hack. > > PS: Also "changingg" has a typo. Looking closer (it's been a while since I touched low-level sh3 stuff), I think I'll just drop the early consinit() call that hpcsh does and let main() do it. That avoids the ugly special case. -uwe
Re: Patching wscons_keydesc at runtime
On Fri, Aug 04, 2017 at 01:38:38 +0200, Emmanuel Dreyfus wrote: > Emmanuel Dreyfus wrote: > > > > Unfortunately this breaks hpcsh which initializes console very early > > > when malloc is not available, so when you boot with wscons the machine > > > wedges. > > > > > > I think your change should be reverted for now and a different fix > > > developed. > > > > Or perhaps it could be just ifdef hpcarm? > > What about this change? > > Index: sys/dev/hpc/hpckbd.c > === > RCS file: /cvsroot/src/sys/dev/hpc/hpckbd.c,v > retrieving revision 1.31 > diff -U4 -r1.31 hpckbd.c > --- sys/dev/hpc/hpckbd.c 12 Jun 2017 09:23:39 - 1.31 > +++ sys/dev/hpc/hpckbd.c 3 Aug 2017 23:36:47 - > @@ -265,15 +265,17 @@ > const keysym_t *map, int mapsize) > { > int i; > const struct wscons_keydesc *desc; > +#ifdef hpcarm > static struct wscons_keydesc *ndesc = NULL; > > /* > * fix keydesc table. Since it is const data, we must > -* copy it once before changingg it. > +* copy it once before changingg it. That does not work > +* on hpcsh which initialize console before malloc is > +* available. > */ > - > if (ndesc == NULL) { > size_t sz; > > for (sz = 0; hpckbd_keymapdata.keydesc[sz].name != 0; sz++); > @@ -282,14 +284,15 @@ > memcpy(ndesc, hpckbd_keymapdata.keydesc, sz * sizeof(*ndesc)); > > hpckbd_keymapdata.keydesc = ndesc; > } > +#endif /* hpcarm */ > > desc = hpckbd_keymapdata.keydesc; > for (i = 0; desc[i].name != 0; i++) { > if ((desc[i].name & KB_MACHDEP) && desc[i].map == NULL) { > - ndesc[i].map = map; > - ndesc[i].map_size = mapsize; > + desc[i].map = map; > + desc[i].map_size = mapsize; > } > } > > return; I think it might be better to just have two copies of the function, old and new. E.g. this patch doesn't restore the unconst hack. PS: Also "changingg" has a typo. -uwe
Re: Patching wscons_keydesc at runtime
On Fri, Aug 04, 2017 at 01:30:14 +0200, Emmanuel Dreyfus wrote: > Valery Ushakov <u...@stderr.spb.ru> wrote: > > > Unfortunately this breaks hpcsh which initializes console very early > > when malloc is not available, so when you boot with wscons the machine > > wedges. > > > > I think your change should be reverted for now and a different fix > > developed. > > Or perhaps it could be just ifdef hpcarm? That will also do for now. Please, don't forget to request pullups. Please, can you also file a PR with the details on what is broken in the original "unconst" version. Do you need to boot with some specific selection in hpcboot, what are the commands, what you expect to happen and what actually happens, etc. I'll try to look into it, but probably not immediately. As I said in an earlier reply, unfortunately layout handling is a mess, b/c the data structure definitions contradict the intended purpose of machdep entries, so some rototill might be necessary. Thanks. -uwe
Re: Patching wscons_keydesc at runtime
On Sat, Jun 10, 2017 at 05:18:16 +0200, Emmanuel Dreyfus wrote: > I managed to restore wscons keymaps by copying > hpckbd_keymapdata.keydesc into a malloc() buffer and changing the > hpckbd_keymapdata.keydesc to the new location, which is mapped > read/write. Unfortunately this breaks hpcsh which initializes console very early when malloc is not available, so when you boot with wscons the machine wedges. I think your change should be reverted for now and a different fix developed. -uwe
Wskbd constness (Was: Patching wscons_keydesc at runtime)
On Sat, Jun 10, 2017 at 05:18:16 +0200, Emmanuel Dreyfus wrote: > I just upgraded an HP Jornada 720 from NetBSD 2.0 to NetBSD 7.1, and > discovered the wscons keymaps were broken in the meantime: it is impossible to > change the keymap using wsconsctl encoding or wsconsctl map. Both commands > succeed but have no effect. > > After poking a few printf in the kernel, I found this in > src/sys/dev/hpc/hpckbd.c: > > /* fix keydesc table */ > /* > * XXX The way this is done is really wrong. The __UNCONST() > * is a hint as to what is wrong. This actually ends up modifying > * initialized data which is marked "const". > * The reason we get away with it here is apparently that text > * and read-only data gets mapped read/write on the platforms > * using this code. > */ > desc = (struct wscons_keydesc *)__UNCONST(hpckbd_keymapdata.keydesc); > for (i = 0; desc[i].name != 0; i++) { > if ((desc[i].name & KB_MACHDEP) && desc[i].map == NULL) { > desc[i].map = map; > desc[i].map_size = mapsize; > } > } > > I managed to restore wscons keymaps by copying hpckbd_keymapdata.keydesc into > a malloc() buffer and changing the hpckbd_keymapdata.keydesc to the new > location, which is mapped read/write. > > The offending code did not change since NetBSD 2.0, except the XXX comment > added in 2015. That suggests the compiler behavior changed about initalized > const data, which was still mapped R/W in the ancient time and is now really > read-only, altough it accepts nilpotent writes without raising an exception. The constness in the MI wskbd code looks wrong: /* KBD_NULLMAP generates a entry for machine native variant. the entry will be modified by machine dependent keyboard driver. */ #define KBD_NULLMAP() ... const struct wscons_keydesc pckbd_keydesctab[] = { ... /* placeholders */ KBD_NULLMAP(KB_US | KB_MACHDEP, KB_US), ... }; Which is obviously self-contradictory. 
This is probably induced by the const in: struct wskbd_mapdata { const struct wscons_keydesc *keydesc; kbd_t layout; }; > + for (sz = 0; hpckbd_keymapdata.keydesc[sz].name != 0; sz++); /usr/share/misc/style requires an explicit no-op "continue" here. -uwe
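For illustration, the copy-then-patch approach Emmanuel describes, with the counting loop written the way style asks for, might look roughly like the userland sketch below. The model_* names and MODEL_MACHDEP are made up (cut-down stand-ins for struct wscons_keydesc and KB_MACHDEP), and the sketch assumes an allocator is available at the time of the call.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MODEL_MACHDEP	0x1000u	/* stands in for KB_MACHDEP */

/* Hypothetical, cut-down stand-in for struct wscons_keydesc. */
struct model_keydesc {
	unsigned int name;	/* 0 terminates the table */
	const int *map;
	int map_size;
};

/*
 * Return a writable heap copy of a const keydesc table with the
 * machine-dependent placeholder entries filled in.  The const
 * original is never written to.  Returns NULL if malloc() fails;
 * the caller owns (and frees) the copy.
 */
struct model_keydesc *
model_patch_keydesc(const struct model_keydesc *tab, const int *map,
    int mapsize)
{
	struct model_keydesc *copy;
	size_t i, n;

	assert(tab != NULL);

	/* Count the entries; note the explicit no-op loop body. */
	for (n = 0; tab[n].name != 0; n++)
		continue;
	n++;			/* include the terminator */

	copy = malloc(n * sizeof(*copy));
	if (copy == NULL)
		return NULL;
	memcpy(copy, tab, n * sizeof(*copy));

	for (i = 0; copy[i].name != 0; i++) {
		if ((copy[i].name & MODEL_MACHDEP) && copy[i].map == NULL) {
			copy[i].map = map;
			copy[i].map_size = mapsize;
		}
	}
	return copy;
}
```

The dependency on malloc() is the whole catch: a console that is initialized before the allocator is up cannot take this path.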
Re: Adding ruminit(4)
On Wed, May 24, 2017 at 17:20:52 +, Christos Zoulas wrote: > Why not move the all the code into a single "ubulkdisable" or > something driver? Finally a thumb to use in-kernel Lua on? :) -uwe
Re: Cnmagic support for wscons
On Tue, Jan 17, 2017 at 11:55:52 +1100, Nathanial Sloss wrote: > On Tue, 17 Jan 2017 07:32:34 Valery Ushakov wrote: > > On Tue, Jan 17, 2017 at 04:26:48 +1100, Nathanial Sloss wrote: > > > On Mon, 16 Jan 2017 00:44:02 Valery Ushakov wrote: > > > > On Sun, Jan 15, 2017 at 13:30:15 +0100, Martin Husemann wrote: > > > > > On Sun, Jan 15, 2017 at 01:59:06PM +1100, Nathanial Sloss wrote: > > > > > > Mapping KS_Cmd_Debugger would also work but I'm unsure as to how > > > > > > to do this without using wskbd key sequences in the magic. > > > > > > > > > > I don't understand - if you just assign KS_Cmd_Debugger somewhere, > > > > > why would you need cnmagic? > > > > > > > > Exactly. You need cnmagic(9) for detecting debugger _sequence_ > > > > in-band like in serial console. > > > > > > > > Your patch doesn't provide any documentation or an accompanying > > > > description, so I'm not sure what exactly it does, e.g. what should > > > > cnmagic value look like for wskbd? > > > > > > I've put an example in the updated man page for both a wskbd command > > > and regular text. > > > > > | +sysctl variable must be prefixed by \\001 for normal characters and/or > > | +control codes. > > | +Alternatively it can be prefixed by \\002 for wskbd commands. > > > > [...] > > > > | +The default cnmagic is \\002\\040\\0364 which on most platforms is > > | +-- > > | +the wskbd command for > > | +.Xr ddb 4 . > > > > - How does \040\0364 (0x20 0xf4) correspond to ? > > > > - Does the fact that KS_Cmd_Debugger is defined as 0xf420 have > > anything to do with it? > > Yes it does. > > > - If so, why is the byte order little endian? > > Endianness could be an issue I'll test in a sparc emulator. > > > - KS_Cmd_Debugger becomes a meaningless placeholder - you will have it > > mapped by default but you can set cnmagic to something else > > > > > > But in general it's not even > > > > entirely clear to me what a semantic of cnmagic(9) for wskbd could be.
> > > > Should it use individual key-presses as the basic input stream it > > > > parses? If yes, you will lose the ability to use, e.g., *both* > > > > C-A-Esc and A-C-Esc _chords_ to break into debugger, because with > > > > wskbd(4) keyboard mapping they are the same _chords_, but with > > > > cnmagic(9) they are different _sequences_. > > > > > > That's why I have two different prefixes one for wskbd commands and > > > another for regular text. > > > > - What if one wants to use a mixture of them? > > I can make it possible to specify keycodes as well. So if a mixture > was wanted one would have to use that. > > > - What happens if you specify in cnmagic the sequence of individual > > keys that map (at wskbd level) to a KS_Cmd_*? > > It would jump into ddb and on return run the KS_Cmd_*. I would expect the proposal to actually describe and document all of that and more. Do we really need to play the game where I have to ask very specific direct questions that you try to answer as literally as possible? > > > > I'd also say that the very fact that the patch doesn't use > > > > cn_check_magic(9) indicates in some sense that it probably does not > > > > implement "cnmagic support for wscons". :) > > > > > > Please see: > > > > > > ftp://ftp.netbsd.org/pub/NetBSD/misc/nat/cnmagic.v2.diff > > > > > > It now uses cn_check magic instead of the custom ws_check_magic which was > > > based on cn_check_magic. > > > > I still intensely dislike the very idea behind this patch. I don't > > think it's meaningful or useful to bolt cnmagic onto wskbd. Yes, I > > can theoretically imagine the possibility that someone somewhere might > > need a wskbd sequence to break into the debugger that for some reason > > cannot be expressed with just mapping KS_Cmd_Debugger. I'd estimate > > the probability of that to be just slightly more than that someone > > somewhere will find it useful to be able to express morse code support > > in wskbd mappings :).
> > > > You PR states the motivation as: > > | wscons does not support cn_magic - setting hw.cnmagic has no effect. > > > > So just drop hw.cnmagic sysctl node if the console doesn't support it. > > There are three specific cases I have experienced. > > An Older hp microserver with remote access c
Re: Cnmagic support for wscons
On Tue, Jan 17, 2017 at 04:26:48 +1100, Nathanial Sloss wrote: > On Mon, 16 Jan 2017 00:44:02 Valery Ushakov wrote: > > On Sun, Jan 15, 2017 at 13:30:15 +0100, Martin Husemann wrote: > > > On Sun, Jan 15, 2017 at 01:59:06PM +1100, Nathanial Sloss wrote: > > > > Mapping KS_Cmd_Debugger would also work but I'm unsure as to how > > > > to do this without using wskbd key sequences in the magic. > > > > > > I don't understand - if you just assign KS_Cmd_Debugger somewhere, > > > why would you need cnmagic? > > > > Exactly. You need cnmagic(9) for detecting debugger _sequence_ > > in-band like in serial console. > > > > Your patch doesn't provide any documentation or an accompanying > > description, so I'm not sure what exactly it does, e.g. what should > > cnmagic value look like for wskbd? > > I've put an example in the updated man page for both a wskbd command > and regular text. | +sysctl variable must be prefixed by \\001 for normal characters and/or | +control codes. | +Alternatively it can be prefixed by \\002 for wskbd commands. [...] | +The default cnmagic is \\002\\040\\0364 which on most platforms is | +-- | +the wskbd command for | +.Xr ddb 4 . - How does \040\0364 (0x20 0xf4) correspond to ? - Does the fact that KS_Cmd_Debugger is defined as 0xf420 have anything to do with it? - If so, why is the byte order little endian? - KS_Cmd_Debugger becomes a meaningless placeholder - you will have it mapped by default but you can set cnmagic to something else > > But in general it's not even > > entirely clear to me what a semantic of cnmagic(9) for wskbd could be. > > Should it use individual key-presses as the basic input stream it > > parses? If yes, you will lose the ability to use, e.g., *both* > > C-A-Esc and A-C-Esc _chords_ to break into debugger, because with > > wskbd(4) keyboard mapping they are the same _chords_, but with > > cnmagic(9) they are different _sequences_.
> > That's why I have two different prefixes one for wskbd commands and > another for regular text. - What if one wants to use a mixture of them? - What happens if you specify in cnmagic the sequence of individual keys that map (at wskbd level) to a KS_Cmd_*? > > I'd also say that the very fact that the patch doesn't use > > cn_check_magic(9) indicates in some sense that it probably does not > > implement "cnmagic support for wscons". :) > > Please see: > > ftp://ftp.netbsd.org/pub/NetBSD/misc/nat/cnmagic.v2.diff > > It now uses cn_check magic instead of the custom ws_check_magic which was > based on cn_check_magic. I still intensely dislike the very idea behind this patch. I don't think it's meaningful or useful to bolt cnmagic onto wskbd. Yes, I can theoretically imagine the possibility that someone somewhere might need a wskbd sequence to break into the debugger that for some reason cannot be expressed with just mapping KS_Cmd_Debugger. I'd estimate the probability of that to be just slightly more than that someone somewhere will find it useful to be able to express morse code support in wskbd mappings :). Your PR states the motivation as: | wscons does not support cn_magic - setting hw.cnmagic has no effect. So just drop the hw.cnmagic sysctl node if the console doesn't support it. -uwe
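The byte-order question above can be spelled out with a few lines of C. The sketch below only shows the arithmetic: 0xf420 (the value quoted in this thread for KS_Cmd_Debugger) split low byte first gives exactly the \040 \364 pair from the proposed default. The names are made up, and whether the patch encodes the value explicitly like this or simply reinterprets a host integer is not stated in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Value quoted in the thread for KS_Cmd_Debugger. */
#define MODEL_KS_CMD_DEBUGGER	0xf420u

/*
 * Split a 16-bit keysym into two bytes, low byte first.  Using
 * shifts and masks makes the result independent of host endianness;
 * copying the integer's in-memory bytes would not be.
 */
void
model_keysym_to_le(uint16_t ks, uint8_t out[2])
{
	assert(out != NULL);
	out[0] = (uint8_t)(ks & 0xff);	/* 0x20 == \040 for 0xf420 */
	out[1] = (uint8_t)(ks >> 8);	/* 0xf4 == \364 for 0xf420 */
}
```

If the sysctl string is instead compared against the keysym's bytes as they sit in memory, the same string would mean a different keysym on a big-endian machine, which is presumably why testing on sparc was suggested.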
Re: Cnmagic support for wscons
On Sun, Jan 15, 2017 at 13:30:15 +0100, Martin Husemann wrote: > On Sun, Jan 15, 2017 at 01:59:06PM +1100, Nathanial Sloss wrote: > > > Mapping KS_Cmd_Debugger would also work but I'm unsure as to how > > to do this without using wskbd key sequences in the magic. > > I don't understand - if you just assign KS_Cmd_Debugger somewhere, > why would you need cnmagic? Exactly. You need cnmagic(9) for detecting debugger _sequence_ in-band like in serial console. Your patch doesn't provide any documentation or an accompanying description, so I'm not sure what exactly it does, e.g. what should cnmagic value look like for wskbd? But in general it's not even entirely clear to me what a semantic of cnmagic(9) for wskbd could be. Should it use individual key-presses as the basic input stream it parses? If yes, you will lose the ability to use, e.g., *both* C-A-Esc and A-C-Esc _chords_ to break into debugger, because with wskbd(4) keyboard mapping they are the same _chords_, but with cnmagic(9) they are different _sequences_. I'd also say that the very fact that the patch doesn't use cn_check_magic(9) indicates in some sense that it probably does not implement "cnmagic support for wscons". :) -uwe
Re: Cnmagic support for wscons
On Sun, Jan 15, 2017 at 09:08:43 +1100, Nathanial Sloss wrote: > Please see: > > ftp://ftp.netbsd.org/pub/NetBSD/misc/nat/wscons.cnmagic.diff > > Also the original PR: > > http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=48360 > > The diff adds cnmagic support for wscons consoles. > > If there are no objections, I'd like to commit this by the > 20th of January. The PR doesn't really say why this is necessary. What's wrong with just mapping KS_Cmd_Debugger? -uwe
Re: ptrace(2) interface for hardware watchpoints (breakpoints)
On Thu, Dec 15, 2016 at 19:51:35 +0100, Kamil Rytarowski wrote: > On 15.12.2016 16:42, Valery Ushakov wrote: > > Again, you don't provide any details. What extra logic? Also, what > > are these few dozens of instructions you are talking about? I.e. what > > is that extra work you have to do for a process-wide watchpoint that > > you don't have to do for an lwp-specific watchpoint on each return to > > userland? > > 1. Complexity is adding extra case in ptrace_watchpoint structure, > adding there a way to specify per-thread or per-process. If there > someone wants to set per-thread watchpoints inside the process > structure.. there would be need to have a list of available watchpoints, > that would scale to number of watchpoints possible x number of threads list. > > 2. Complexity on returning to userland - need to lock structure process > in userret(9) and check every watchpoint if it's process-wide or > dedicated for the thread. Why would you need all this? Consider the case when debug registers are part of the mcontext, then the very act of restoring the context enables corresponding watchpoints for the lwp. When the debug registers are not part of mcontext the only difference is that after restoring the mcontext you also set debug registers from some other structure. E.g. sh3 uses User Break Controller to implement single-stepping, so effectively a kind of watchpoint that is triggered after instruction, not matching any address bits, asid, etc, etc. The register in UBC that enables the watchpoint is set from a field in trapframe, just like any other register. So at ptrace(2) time to set a process-wide watchpoint, you go over all existing lwps and setup their trapframes accordingly. For new lwps created after the watchpoint is set you need to do that at lwp creation time. But when lwp returns to userland, there's no overhead. 
> I implemented it originally per process and I finally decided to throw > the per-process vs per-thread logic away, out of the kernel and expose > watchpoints (or technically bitmasks of available debug registers) to > userland. > > It's easier to check perlwp local structure and end up with up to 4 > fields there, than lock a list and iterate over N elements. Every thread > has also dedicated bit in its property indicating whether it has > attached watchpoints. > > From user-land point of view, and management it's equivalent. With the > difference that debugger needs to catch thread creation and apply > desired watchpoint to it. > > Why bitmasks and not raw registers? On some level there is need to check > if the composed combination is valid in the kernel - dividing > user-settable bits from registers to bitmask is needed on some level > anyway, and while it's possible to be done in kernel, why not to export > it to userland? > > I've found it easier to be reused in 3rd party software. -uwe
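The scheme described in the reply above (set the per-lwp state for every existing lwp at ptrace(2) time, give new lwps the same state at creation, and do nothing special on return to userland) can be modelled in a few lines. This is a toy userland sketch, not kernel code: all model_* names are invented, and the per-lwp fields merely stand in for whatever debug state a port keeps in its trapframe or mcontext.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAXLWP	8

/* Stands in for the per-lwp debug state restored on context switch. */
struct model_lwp {
	unsigned long watch_addr;
	int watch_enabled;
};

struct model_proc {
	struct model_lwp lwps[MODEL_MAXLWP];
	size_t nlwps;
	/* Process-wide template, consulted only when an lwp is created. */
	unsigned long watch_addr;
	int watch_enabled;
};

/* "ptrace time": record the watchpoint and push it to every existing lwp. */
void
model_set_watchpoint(struct model_proc *p, unsigned long addr)
{
	size_t i;

	assert(p != NULL);
	p->watch_addr = addr;
	p->watch_enabled = 1;
	for (i = 0; i < p->nlwps; i++) {
		p->lwps[i].watch_addr = addr;
		p->lwps[i].watch_enabled = 1;
	}
}

/* "lwp creation time": a new lwp inherits the process-wide setting. */
struct model_lwp *
model_lwp_create(struct model_proc *p)
{
	struct model_lwp *l;

	assert(p != NULL);
	if (p->nlwps == MODEL_MAXLWP)
		return NULL;
	l = &p->lwps[p->nlwps++];
	l->watch_addr = p->watch_addr;
	l->watch_enabled = p->watch_enabled;
	return l;
}
```

Note what is absent: there is no lookup, lock or list walk on the return-to-userland path, because each lwp already carries its own copy of the state.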
Re: ptrace(2) interface for hardware watchpoints (breakpoints)
On Tue, Dec 13, 2016 at 18:16:04 +0100, Kamil Rytarowski wrote: > >> 4. Do not set watchpoints globally per process, limit them to > >> threads (LWP). [...] Adding process-wide management in the > >> ptrace(2) interface calls adds extra complexity that should be > >> pushed away to user-land code in debuggers. > > > > I have no idea what amd64 debug registers do, but this smells like you > > are exposing in the MI interface some of those details. I don't think > > this can be done in hardware on sh3, e.g. Ok, I was confused there for a moment. The "debug state" is per-lwp and is restored when lwp is switched to. What was I thinking... > > Also, you quite often have no idea which thread stomps on your data, > > so I'd imagine most of the time you do want a global watchpoint. > > This is true. > > With the proposed interface per-thread a debugger can set the same > hardware watchpoint for each LWP and achieve the same result. There are > no performance or synchronization challenges as watchpoints can be set > only when a process is stopped. > > In my older code I had logic per-process to access watchpoints, but > it required extra logic in thread-specific functions to access > process specific data. I assumed that saving few dozens of CPU > cycles before each thread entering user-space is precious. (I know > it's a small optimization, however it's for free) Again, you don't provide any details. What extra logic? Also, what are these few dozens of instructions you are talking about? I.e. what is that extra work you have to do for a process-wide watchpoint that you don't have to do for an lwp-specific watchpoint on each return to userland? > >> 5. Do not allow to mix PT_STEP and hardware watchpoint, in case of > >> single-stepping the code, disable (it means: don't set) hardware > >> watchpoints for threads. Some platforms might implement single-step with > >> hardware watchpoints and managing both at the same time is generating > >> extra pointless complexity. 
> > > > I don't think I see how "extra pointless complexity" follows. > > 1. At least in MD x86 specific code, watchpoint traps triggered with > stepped code are reported differently to those reported with plain steps > and also differently to plain hardware watchpoint traps. They are 3rd > type of a trap. > > 2. Single stepping can be implemented with hardware assisted watchpoints > (technically breakpoints) on the kernel side in MD. And if so, trying to > apply watchpoints and singlestep will conflict and this will need > additional handling on the kernel side. > > To oppose extra complexity I propose to make stepping and watchpoints > separable, one or the other, but not both. And again you allude to MD details and don't provide any. You cannot just handwave this away. You will have to provide enough information for people to implement this for other arches eventually, including MD specifics that affected the design, so that people can see how their MD specific details affect their implementation. Why not provide this upfront? I understand you might be eager to commit this work and be done with it, but you are doing this fulltime. Others don't have this luxury. So I don't want to come around to implementing your design in a few months' time when I have some spare cycles and discover that it's ill suited for the hardware I have to deal with. Maybe you are right, and it's hard to mix single-stepping and watchpoints, but I don't have time to investigate this fully right now for sh3 and you don't provide any details that will back your conclusion for x86. Has it occurred to you that you might be missing some approach to solving this, but people that grok x86 can't tell you unless they know the details? And I don't think that committing first, as you seem to have done already, and then letting people figure it out from RTFS is an acceptable approach, b/c, again, without description you force people to RTFS and they might not have the time.
> > Also, you might want both, single-stepping and waiting for a > > watchpoint. Will debugger have switch dynamically to software > > watchpoints when single-stepping? Can it even do that already? > > My understanding of stepping the code is that we want to go one and only > one instruction ahead (unless port restricts it and its 1 or more), > followed with a break. > > What's the use case of waiting for data access and stepping in the same > time? Is it needed? Does it solve some issues that cannot be solved > otherwise? Could it be implemented in software (in case of watch)? Isn't it your job to tell us the answers? So, let's say I set a watchpoint and then I hit some other breakpoint and do some stepi. If one of the instructions I'm stepping through does the read/write I'm watching for, how will it be detected by the debugger if you can't mix hw-assisted watchpoints and single-stepping? > My original intention was to make it friendly for ports, without too > specific
Re: ptrace(2) interface for hardware watchpoints (breakpoints)
On Tue, Dec 13, 2016 at 02:04:36 +0100, Kamil Rytarowski wrote: > The design is as follows: > > 1. Accessors through: > - PT_WRITE_WATCHPOINT - write new watchpoint's state (set, unset, ...), > - PT_READ_WATCHPOINT - read watchpoints's state, > - PT_COUNT_WATCHPOINT - receive the number of available watchpoints. Gdb supports hardware assisted watchpoints. That implies that other OSes have existing designs for them. Have you studied those existing designs? Why do you think they are not suitable to be copied? > 4. Do not set watchpoints globally per process, limit them to > threads (LWP). [...] Adding process-wide management in the > ptrace(2) interface calls adds extra complexity that should be > pushed away to user-land code in debuggers. I have no idea what amd64 debug registers do, but this smells like you are exposing in the MI interface some of those details. I don't think this can be done in hardware on sh3, e.g. Also, you quite often have no idea which thread stomps on your data, so I'd imagine most of the time you do want a global watchpoint. Note that if you want to restrict your watchpoint to one thread, you can probably (I don't know and I haven't checked) do this with a gdb "command" that "continue"s if it's on the wrong thread. > 5. Do not allow to mix PT_STEP and hardware watchpoint, in case of > single-stepping the code, disable (it means: don't set) hardware > watchpoints for threads. Some platforms might implement single-step with > hardware watchpoints and managing both at the same time is generating > extra pointless complexity. I don't think I see how "extra pointless complexity" follows. Also, you might want both, single-stepping and waiting for a watchpoint. Will the debugger have to switch dynamically to software watchpoints when single-stepping? Can it even do that already? In general I'd appreciate it if handwavy "this is pointless/extra complexity" arguments were spelled out. 
They might be obvious to you, but most people reading this don't have relevant information swapped in, or don't know enough details. -uwe
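For readers without the details swapped in, the per-LWP proposal quoted above (the debugger sets the same watchpoint in every LWP to get a process-wide one) can be sketched roughly as below. This is only a toy model in plain C, not ptrace(2) code: every name in it (`toy_proc`, `toy_lwp_set_watch`, `toy_proc_set_watch`, `NWATCH`, ...) is made up for illustration, and it assumes the process is stopped while the debugger walks its LWPs.

```c
#include <assert.h>
#include <stdint.h>

/* Toy model only: hypothetical names, NOT the ptrace(2) interface. */
#define MAX_LWPS 8
#define NWATCH   4   /* assumed number of watchpoint slots per LWP */

struct toy_watch { uintptr_t addr; int enabled; };
struct toy_lwp   { struct toy_watch wp[NWATCH]; };
struct toy_proc  { int nlwps; struct toy_lwp lwp[MAX_LWPS]; };

/* Stand-in for a per-LWP "write watchpoint" request. */
int toy_lwp_set_watch(struct toy_lwp *l, int idx, uintptr_t addr)
{
    if (idx < 0 || idx >= NWATCH)
        return -1;
    l->wp[idx].addr = addr;
    l->wp[idx].enabled = 1;
    return 0;
}

/*
 * Debugger-side emulation of a process-wide watchpoint: repeat the
 * same per-LWP request for every LWP of the (stopped) process.
 * Returns the number of LWPs updated, or -1 on the first failure.
 */
int toy_proc_set_watch(struct toy_proc *p, int idx, uintptr_t addr)
{
    for (int i = 0; i < p->nlwps; i++)
        if (toy_lwp_set_watch(&p->lwp[i], idx, addr) != 0)
            return -1;
    return p->nlwps;
}

/* Would an access to addr by this LWP hit one of its watchpoints? */
int toy_lwp_would_trap(const struct toy_lwp *l, uintptr_t addr)
{
    for (int i = 0; i < NWATCH; i++)
        if (l->wp[i].enabled && l->wp[i].addr == addr)
            return 1;
    return 0;
}
```

In this model the process-wide bookkeeping lives entirely in the debugger's loop, which is the trade-off being argued about in the thread; the sketch says nothing about what any real hardware can or cannot do.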
Re: CVS commit: src/sys/dev/pci
On Tue, Nov 29, 2016 at 21:54:11 +, Valeriy E. Ushakov wrote: > Module Name: src > Committed By: uwe > Date: Tue Nov 29 21:54:11 UTC 2016 > > Modified Files: > src/sys/dev/pci: if_vioif.c > > Log Message: > vioif_start() - do not call virtio_enqueue_abort() after error from > virtio_enqueue_reserve(), as it's already done by the latter, so we > ended up with a kind of "double free" that messed up out free list of > vq_entry's. > > This is even documented in a "typical usage" comment in virtio.c (and > those quotes are not intended to be sarcastic). > > PR 51132 - virtio net device stuck for UDP burst transmission > > > To generate a diff of this commit: > cvs rdiff -u -r1.26 -r1.27 src/sys/dev/pci/if_vioif.c This seems to be a common problem, as both the ld at virtio and viornd drivers make the same mistake too. I'd appreciate it if people could fix and test them (with a simulated failure if necessary). I wonder if http://gnats.netbsd.org/50604 might be caused by this, as the first time you run out of vq_entry's, you will end up with a messed up free list. -uwe
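To make the "double free" from the log message concrete, here is a small self-contained toy model in C. It is only a sketch: the names (`toy_queue`, `toy_prep`, `toy_reserve`, `toy_abort`, `toy_start`) are hypothetical and do not reproduce the real virtio code or its signatures. The only things taken from the commit message are that the reserve step already aborts the slot on failure and that a second abort by the caller corrupts the free list.

```c
#include <assert.h>

/* Toy model only: hypothetical names, NOT the real virtio API. */
#define NSLOTS 4

struct toy_queue {
    int next_free[NSLOTS];  /* singly linked free list, -1 terminates */
    int free_head;          /* first free slot, -1 if none */
};

void toy_init(struct toy_queue *q)
{
    for (int i = 0; i < NSLOTS; i++)
        q->next_free[i] = (i + 1 < NSLOTS) ? i + 1 : -1;
    q->free_head = 0;
}

/* Take a slot off the free list; -1 if the list is empty. */
int toy_prep(struct toy_queue *q)
{
    int slot = q->free_head;
    if (slot >= 0)
        q->free_head = q->next_free[slot];
    return slot;
}

/* Push a slot back onto the free list, with no double-free check. */
void toy_abort(struct toy_queue *q, int slot)
{
    q->next_free[slot] = q->free_head;
    q->free_head = slot;
}

/* On failure the slot is already released here, as in the log message. */
int toy_reserve(struct toy_queue *q, int slot, int fail)
{
    if (fail) {
        toy_abort(q, slot);
        return -1;
    }
    return 0;
}

/* Caller path; redundant_abort != 0 models the bug that was fixed. */
int toy_start(struct toy_queue *q, int fail, int redundant_abort)
{
    int slot = toy_prep(q);
    if (slot < 0)
        return -1;
    if (toy_reserve(q, slot, fail) != 0) {
        if (redundant_abort)
            toy_abort(q, slot);     /* second release of the same slot */
        return -1;
    }
    return slot;
}

/* Length of the free list, or -1 if it has become cyclic. */
int toy_free_count(const struct toy_queue *q)
{
    int n = 0;
    for (int s = q->free_head; s >= 0; s = q->next_free[s])
        if (++n > NSLOTS)
            return -1;
    return n;
}
```

In this model the second `toy_abort()` makes the slot point at itself, so the free list becomes a one-element cycle and every later `toy_prep()` hands out the same slot again. The real driver's symptoms may differ; the sketch only shows why releasing a slot twice is enough to wreck a free list of this shape.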
Re: SOSEND_LOAN problems in MIPS
On Sun, Jun 19, 2016 at 16:25:20 +0100, Robert Swindells wrote: > co...@sdf.org wrote: > >in emulating pmax with gxemul I had trouble using: > > cat somefile | command > > > >when somefile is bigger than 4096 bytes. [...] > It would probably also help to know where the file that you read > using cat was stored, was it read over NFS? More likely options PIPE_SOCKETPAIR, I guess. -uwe
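For anyone who wants to poke at this without a particular file at hand, a rough stand-in for `cat somefile | command` is sketched below: a child writes a known byte pattern into a pipe and the parent checks what comes out. The function name `pipe_roundtrip`, the pattern and the 512-byte chunk size are arbitrary choices for illustration; this has not been confirmed to trigger the problem reported in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Deterministic test pattern, so the reader can verify every byte. */
static unsigned char pattern_byte(size_t i)
{
    return (unsigned char)(i * 31u + 7u);
}

/*
 * Push nbytes through a pipe from a child process to the parent.
 * Returns 0 if every byte arrived intact, -1 on error or mismatch.
 */
int pipe_roundtrip(size_t nbytes)
{
    int fds[2];
    unsigned char buf[512];

    if (pipe(fds) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                     /* child: the "cat" side */
        close(fds[0]);
        size_t sent = 0;
        while (sent < nbytes) {
            size_t n = nbytes - sent;
            if (n > sizeof buf)
                n = sizeof buf;
            for (size_t i = 0; i < n; i++)
                buf[i] = pattern_byte(sent + i);
            size_t off = 0;
            while (off < n) {           /* handle short writes */
                ssize_t w = write(fds[1], buf + off, n - off);
                if (w <= 0)
                    _exit(1);
                off += (size_t)w;
            }
            sent += n;
        }
        close(fds[1]);
        _exit(0);
    }

    close(fds[1]);                      /* parent: the "command" side */
    size_t got = 0;
    int ok = 1;
    for (;;) {
        ssize_t r = read(fds[0], buf, sizeof buf);
        if (r < 0) {
            ok = 0;
            break;
        }
        if (r == 0)                     /* EOF: writer closed its end */
            break;
        for (ssize_t i = 0; i < r; i++)
            if (buf[i] != pattern_byte(got + (size_t)i))
                ok = 0;
        got += (size_t)r;
    }
    close(fds[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        ok = 0;
    else if (!WIFEXITED(status) || WEXITSTATUS(status) != 0)
        ok = 0;

    return (ok && got == nbytes) ? 0 : -1;
}
```

Calling it with sizes on both sides of 4096 (say 4096 and 3 * 4096 + 1) on kernels built with and without options PIPE_SOCKETPAIR would be one way to narrow things down.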
Re: Scripting DDB in Forth?
On Mon, May 02, 2016 at 04:59:32 +0300, Valery Ushakov wrote: > I'd say that someone familiar with the target ISA can port == write > the asm core in an evening or two. Just a quick follow up note. I wanted to verify that claim, so I ported it to powerpc, which also helped to make the MI part really (or at least more) MI. I had zero ppc knowledge before starting that exercise. It took me three evenings, four if you count reading up on the ISA. https://bitbucket.org/nbuwe/forth -uwe