Re: Potential new syscall
On 03.04.2018 16:57, Mouse wrote:
>> From the GDB protocol point of view,
>
> ...what does gdb have to do with it? Did I miss something?

We need to track fork and its variants. Just a note that, from the existing protocol's point of view, we can handle this new syscall.
Re: Potential new syscall
> From the GDB protocol point of view,

...what does gdb have to do with it? Did I miss something?

> I think there is needed prior verification of its stability and
> benchmarking before the final decision.

I would expect such work to be done before it goes into the main NetBSD tree. What I have is a proof-of-concept implementation. For anyone willing to run my 5.2 variant, or to port what I've done to stock 5.2 (which would probably be easy) or to -current (which I can only speculate about), it can provide something to test and benchmark.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: Potential new syscall
>> [...] - "just use fork" is a very common response, but no matter how
>> fork gets implemented, vfork() when used correctly always performs
>> better by huge margins.
> But most of those cases are handled just as well by posix_spawn.

Possibly - but most of a system's operation is handled perfectly well by no more than a few dozen syscalls. Is that a reason to get rid of the rest?

If you want, sure, use posix_spawn when it's applicable. But it's also nice to have something that can handle the cases where it _isn't_ applicable - which is, in a sense, what fork() is for, but it's also nice to not cripple performance unnecessarily.

And, in my case, the only easy answer was to make vfork() equivalent to fork() in the _emulated_ system, which I consider a last-ditch fallback. The new syscall is almost as easy (for me) and much closer to correct.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: Potential new syscall
>> Basically, what I want is a syscall that a vfork()ed child can call
>> to have the unsharing effects of execve(2) or _exit(2) [...]

> I have considered (and think I mentioned on some list some time ago)
> the same capability to improve sh performance.
> [...]
> Having the mechanism available for testing (even if it was not
> committed to the standard NetBSD sources (yet?))

Certainly not yet; at present, it exists on only one machine. I don't run -current, so it's not appropriate for me to try to put it into -current, but I would be fine with someone else doing so.

The "one machine" that has it now is running (my evolution of) 5.2. On my morning commute today I tested it, and it works, for very rudimentary smoke-test values of "works".

> Kamil - "just use fork" is a very common response, but no matter how
> fork gets implemented, vfork() when used correctly always performs
> better by huge margins.

Which of course is why vfork exists at all. :-)

> You are of course correct that there is a very limited set of
> functions possible in a vfork()'d child -

I disagree. The set of functions usable in a vfork()ed child is actually quite wide on most systems - but you have to know a good deal about the implementation of vfork() and the functions in question to know which ones are safe and why, and how to safely use the ones you can. (For example, on the 5.2 I'm working on, I can printf() from the child just fine, provided I fflush() at suitable times so that stdio's internals don't get confused.)

The set of functions you can use narrows as you care about wider and wider portability, to the point where, if you're trying to be portable to anything POSIX, vfork() is basically useless (because you can't do any of the usual post-fork pre-exec prep).

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
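For readers unfamiliar with the portable end of that spectrum, the following is a minimal sketch of a vfork() child that does nothing except execve() or _exit(). The helper name `vfork_exec` is hypothetical and does not come from the thread; the sketch assumes a system that still provides vfork(). It is the sort of usage that should be safe nearly everywhere, precisely because the child performs none of the usual pre-exec preparation.

```c
#define _DEFAULT_SOURCE /* assumption: exposes vfork() on glibc in strict modes */
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `path` with the given argv/envp via vfork().
 * Returns the child's exit status, 127 if execve() failed in the child,
 * or -1 if vfork()/waitpid() failed or the child died from a signal.
 */
static int vfork_exec(const char *path, char *const argv[], char *const envp[])
{
    int status;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;

    if (pid == 0) {
        /*
         * Child: it borrows the parent's address space until execve()
         * or _exit(), so it touches no parent state and calls nothing else.
         */
        execve(path, argv, envp);
        _exit(127); /* only reached if execve() failed */
    }

    /* Parent: resumes once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Anything beyond this in the child (descriptor shuffling, signal setup, stdio) depends on the vfork() implementation of the system at hand, which is the gap the proposed syscall is meant to close.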
Re: Potential new syscall
>> Basically, what I want is a syscall that a vfork()ed child can call
>> to have the unsharing effects of execve(2) or _exit(2) (return the
>> vmspace to the parent and let it continue), while the child
>> carries on with a clone of the vmspace [...]

> That sounds suspiciously like Linux's unshare(2) call, with the
> CLONE_VM option.

Yes, except that (based on reading the unshare(2) manpage on a work machine) unshare(CLONE_VM) doesn't have the "let the parent continue" semantic that my putative syscall does. What I want could perhaps be called unshare(CLONE_VFORK) (which doesn't seem to exist).

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B
Re: Potential new syscall
On Tue, Apr 03, 2018 at 09:08:15AM +0700, Robert Elz wrote:
> Kamil - "just use fork" is a very common response, but no matter how
> fork gets implemented, vfork() when used correctly always performs
> better by huge margins.

But most of those cases are handled just as well by posix_spawn, which doesn't have any of the thread-safety issues that vfork has.

Joerg
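As a point of comparison, here is a minimal sketch of the posix_spawn route for the simple "run a program and wait" case. The helper name `spawn_and_wait` is hypothetical; it uses posix_spawnp() with no file actions and no attributes, so it only covers the cases where nothing needs to be set up between process creation and exec.

```c
#include <spawn.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/*
 * Hypothetical helper: start argv[0] (looked up via PATH) and wait for it.
 * Returns the exit status, or -1 if spawning or waiting failed or the
 * child was terminated by a signal.
 */
static int spawn_and_wait(char *const argv[])
{
    pid_t pid;
    int status;

    /* posix_spawnp() returns an error number directly; it does not set errno. */
    int err = posix_spawnp(&pid, argv[0], NULL, NULL, argv, environ);
    if (err != 0) {
        fprintf(stderr, "posix_spawnp: %s\n", strerror(err));
        return -1;
    }

    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Redirections and similar setup would go through posix_spawn_file_actions_t and posix_spawnattr_t; whatever those interfaces cannot express is where fork(), vfork(), or the proposed syscall come back into the picture.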
re: Fixing excessive shootdowns during FFS read (kern/53124) on x86 - emap, direct map
> 4GB of KVA; and in addition pmap_kernel will consume some KVA too. i386 for
> example has only 4GB of ram and 4GB of KVA, so we'll just never add a direct
> map there.

not that it changes your direct map point, but i386 GENERIC_PAE works fine for at least 16GB ram. it should work up to 64GB. actually, it only strengthens your direct map point, since it's even harder to fit 64GB phys into 4GB virt :-)

> As opposed to that, emaps can be implemented everywhere with no constraint on
> the arch. I think they are better than the direct map for uvm_bio.

IIRC, the main reason we stopped using emap is that there were performance issues. rmind?

.mrg.
Re: Fixing excessive shootdowns during FFS read (kern/53124) on x86 - emap, direct map
On 02/04/2018 at 21:28, Jaromír Doleček wrote:
> 2018-03-31 13:42 GMT+02:00 Jaromír Doleček:
>> 2018-03-25 17:27 GMT+02:00 Joerg Sonnenberger:
>>> Yeah, that's what ephemeral mappings were supposed to be for. The other
>>> question is whether we can't just use the direct map for this on amd64
>>> and similar platforms?
>>
>> Right, we could/should use emap. I haven't realized emap is actually
>> already implemented. It's currently used for pipe for the loan/"direct"
>> write. I don't know anything about emap though. Are there any known
>> issues, do you reckon it's ready to be used for general I/O handling?
>
> Okay, so I've hacked together a patch to switch uvm_bio.c to ephemeral
> mapping:
>
> http://www.netbsd.org/~jdolecek/uvm_bio_emap.diff

> - pmap_emap_enter(va, pa, VM_PROT_READ);
> + pmap_emap_enter(va, pa, VM_PROT_READ | VM_PROT_WRITE);

Mmh no, sys_pipe wanted it read-only, we shouldn't make it writable by default. Adding a prot argument to uvm_emap_enter would be better.

> Looking at the state of usage though, the emap is only used for disabled
> code path for sys_pipe and nowhere else. That code had several on-and-off
> passes for being enabled in 2009, and no further use since then. Doesn't
> give too much confidence.
>
> The only port actually having optimization for emap is x86. Since amd64
> is also the only one supporting direct map, we are really at liberty to
> pick either one. I'd lean towards direct map, since that doesn't require
> adding/removing any mapping in pmap_kernel() at all. From looking at the
> code, I gather direct map is quite easy to implement for other archs like
> sparc64. I'd say significantly easier than adding the necessary emap
> hooks into MD pmaps.

There is a good number of architectures where implementing a direct map is not possible, because of KVA consumption. With a direct map, the kernel needs more KVA than there is physical memory: if you have 4GB of ram, the direct map will consume 4GB of KVA; and in addition pmap_kernel will consume some KVA too. i386 for example has only 4GB of ram and 4GB of KVA, so we'll just never add a direct map there.

Direct maps are good when the architecture has much, much more KVA than it has physical space.

I saw some low-KVA architectures have a "partial direct map", where only a (tiny) area of the physical space is direct-mapped. There, we would have to adapt uvm_bio to use pmap_kernel instead, which seems ugly.

As opposed to that, emaps can be implemented everywhere with no constraint on the arch. I think they are better than the direct map for uvm_bio.
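The KVA argument is simple arithmetic, and a small sketch may make it concrete. The function name `direct_map_fits`, the `GIB` macro, and the 1GB figure for other kernel mappings are illustrative assumptions, not values taken from any NetBSD port.

```c
#include <stdbool.h>
#include <stdint.h>

#define GIB (UINT64_C(1) << 30)

/*
 * Hypothetical check: a full direct map needs one byte of KVA per byte of
 * physical memory, on top of whatever the rest of the kernel (pmap_kernel
 * mappings and so on) already consumes.
 */
static bool direct_map_fits(uint64_t phys_bytes, uint64_t kva_bytes,
                            uint64_t other_kernel_bytes)
{
    if (other_kernel_bytes > kva_bytes)
        return false; /* nothing left for a direct map */
    return phys_bytes <= kva_bytes - other_kernel_bytes;
}
```

With these assumptions, `direct_map_fits(4 * GIB, 4 * GIB, 1 * GIB)` is false (the i386 case described above), `direct_map_fits(64 * GIB, 4 * GIB, 0)` is false even with nothing else mapped (the PAE case), and a machine with 64GB of ram and a 128TB kernel address range, as on a 48-bit amd64 layout, passes easily.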