Re: Potential new syscall

2018-04-07 Thread Robert Elz
Date: Sat, 7 Apr 2018 08:36:25 -0400 (EDT)
From: Mouse
Message-ID: <201804071236.iaa14...@stone.rodents-montreal.org>

  | kre, assuming you want it now, how would be most useful to get it to
  | you?

e-mail patches or the modified source files (I can easily extract the
original 5.2 versions to compare).

  | What I have is a commit in my based-at-5.2 src git tree.  I can
  | extract diffs easily,

That would be best.

  | or you can clone the git repo

I don't use git - never have ...

Thanks,

kre



Re: Potential new syscall

2018-04-07 Thread Mouse
>> ["break vfork sharing" syscall]
> Having the mechanism available for testing (even if it was not
> committed to the standard NetBSD sources (yet?)) would be a real
> help, [...]

It works, at least for rudimentary values of "works": a test program
behaves as it should, and the emulator that prompted me to create this
works for small numbers of runs.  I haven't constructed a test to
hammer it for millions or billions of calls; there could well be a rare
failure mode, like a race somewhere.

kre, assuming you want it now, how would be most useful to get it to
you?  What I have is a commit in my based-at-5.2 src git tree.  I can
extract diffs easily, or you can clone the git repo and look at the
commit (git://git.rodents-montreal.org/Mouse/netbsd-fork/5.2/src commit
3bc0da98f79eb0115f5c4992d7b42b6623ae7b78), or if you have something
else to suggest I'm listening.

/~\ The ASCII                             Mouse
\ / Ribbon Campaign
 X  Against HTML                mo...@rodents-montreal.org
/ \ Email!           7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


Re: Potential new syscall

2018-04-03 Thread Kamil Rytarowski
On 03.04.2018 16:57, Mouse wrote:
>> From the GDB protocol point of view,
> ...what does gdb have to do with it?  Did I miss something?
> 

We need to track fork and its variations. Just a note that, from the
existing protocol's point of view, we can handle this new syscall.





Re: Potential new syscall

2018-04-03 Thread Mouse
> From the GDB protocol point of view,

...what does gdb have to do with it?  Did I miss something?

> I think prior verification of its stability, and benchmarking, are
> needed before the final decision.

I would expect such work to be done before it goes into the main NetBSD
tree.  What I have is a proof-of-concept implementation and, for anyone
willing to run my 5.2 variant, or willing to port what I've done to
stock 5.2 (which would probably be easy), or port what I have to
-current (which I can only speculate about), it can provide something
to test and benchmark.



Re: Potential new syscall

2018-04-03 Thread Mouse
>> [...] - "just use fork" is a very common response, but no matter how
>> fork gets implemented, vfork() when used correctly always performs
>> better by huge margins.
> But most of those cases are handled just as well by posix_spawn.

Possibly - but most of a system's operation is handled perfectly well
by no more than a few dozen syscalls.  Is that a reason to get rid of
the rest?

If you want, sure, use posix_spawn when it's applicable.  But it's also
nice to have something that can handle the cases where it _isn't_
applicable - which is, in a sense, what fork() is for, but it's also
nice to not cripple performance unnecessarily.  And, in my case, the
only easy answer was to make vfork() equivalent to fork() in the
_emulated_ system, which I consider a last-ditch fallback.  The new
syscall is almost as easy (for me) and much closer to correct.



Re: Potential new syscall

2018-04-03 Thread Mouse
>> Basically, what I want is a syscall that a vfork()ed child can call
>> to have the unsharing effects of execve(2) or _exit(2) [...]
> I have considered (and think I mentioned on some list some time ago)
> the same capability to improve sh performance.
> [...]
> Having the mechanism available for testing (even if it was not
> committed to the standard NetBSD sources (yet?))

Certainly not yet; at present, it exists on only one machine.  I don't
run -current, so it's not appropriate for me to try to put it into
-current, but I would be fine with someone else doing so.

The "one machine" that has it now is running (my evolution of) 5.2.  On
my morning commute today I tested it, and it works, for very
rudimentary smoke-test values of "works".

> Kamil - "just use fork" is a very common response, but no matter how
> fork gets implemented, vfork() when used correctly always performs
> better by huge margins.

Which of course is why vfork exists at all. :-)

> You are of course correct that there is a very limited set of
> functions possible in a vfork()'d child -

I disagree.  The set of functions usable in a vfork()ed child is
actually quite wide on most systems - but you have to know a good deal
about the implementation of vfork() and the functions in question to
know which ones are safe and why, and how to safely use the ones you
can.  (For example, on the 5.2 I'm working on, I can printf() from the
child just fine, provided I fflush() at suitable times so that stdio's
internals don't get confused.)  The set of functions you can use
narrows as you care about wider and wider portability, to the point
where, if you're trying to be portable to anything POSIX, vfork() is
basically useless (because you can't do any of the usual post-fork
pre-exec prep).
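As an illustration of the "usual post-fork pre-exec prep" mentioned above, here is a minimal sketch of a vfork()ed child that redirects stdout before exec. The helper name run_quiet_vfork is hypothetical. It assumes an implementation (such as NetBSD or Linux) where plain system calls like open(2) and dup2(2) are safe in the child and the child has its own descriptor table; POSIX itself promises nothing beyond exec and _exit, so this is not portable code.

```c
#define _DEFAULT_SOURCE         /* expose vfork() on glibc; harmless elsewhere */
#include <assert.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper: run `cmd` via /bin/sh with stdout discarded.
 * Returns the command's exit status, or -1 on failure.
 */
static int run_quiet_vfork(const char *cmd)
{
    int status;
    pid_t pid;

    assert(cmd != NULL);

    pid = vfork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /*
         * Child: borrows the parent's address space until exec or _exit.
         * ASSUMPTION: direct syscalls are safe here on this system, and
         * the descriptor table is private to the child.
         */
        int fd = open("/dev/null", O_WRONLY);
        if (fd == -1)
            _exit(126);
        if (dup2(fd, STDOUT_FILENO) == -1)
            _exit(126);
        if (fd != STDOUT_FILENO)
            close(fd);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);             /* exec failed; never return from the child */
    }

    /* Parent: resumes only after the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling run_quiet_vfork("echo hidden; exit 4") should print nothing and return the shell's exit status. The child deliberately avoids stdio and anything that allocates memory, since those are the calls whose safety depends most on implementation details.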



Re: Potential new syscall

2018-04-03 Thread Mouse
>> Basically, what I want is a syscall that a vfork()ed child can call
>> to have the unsharing effects of execve(2) or _exit(2) (return the
>> vmspace to the parent and let it continue), while the child
>> carries on with a clone of the vmspace [...]
> That sounds suspiciously like Linux's unshare(2) call, with the
> CLONE_VM option.

Yes, except that (based on reading the unshare(2) manpage on a work
machine) unshare(CLONE_VM) doesn't have the "let the parent continue"
semantic that my putative syscall does.  What I want could perhaps be
called unshare(CLONE_VFORK) (which doesn't seem to exist).



Re: Potential new syscall

2018-04-03 Thread Joerg Sonnenberger
On Tue, Apr 03, 2018 at 09:08:15AM +0700, Robert Elz wrote:
> Kamil - "just use fork" is a very common response, but no matter how
> fork gets implemented, vfork() when used correctly always performs
> better by huge margins.

But most of those cases are handled just as well by posix_spawn. Which
doesn't have any of the thread-safety issues that vfork has.
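For comparison, a minimal sketch of the posix_spawn approach, covering a common pre-exec step (redirecting stdout) through file actions instead of code run in the child. The helper name run_quiet_spawn is hypothetical; the posix_spawn calls themselves are the standard POSIX ones.

```c
#include <assert.h>
#include <fcntl.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/*
 * Hypothetical helper: run `cmd` via /bin/sh with stdout discarded,
 * using posix_spawn file actions for the redirection.
 * Returns the command's exit status, or -1 on failure.
 */
static int run_quiet_spawn(const char *cmd)
{
    posix_spawn_file_actions_t fa;
    char *argv[] = { "sh", "-c", (char *)cmd, NULL };
    pid_t pid;
    int status, err;

    assert(cmd != NULL);

    if (posix_spawn_file_actions_init(&fa) != 0)
        return -1;

    /* Ask for /dev/null to be opened as stdout in the new process. */
    err = posix_spawn_file_actions_addopen(&fa, STDOUT_FILENO,
                                           "/dev/null", O_WRONLY, 0);
    if (err == 0)
        err = posix_spawn(&pid, "/bin/sh", &fa, NULL, argv, environ);
    posix_spawn_file_actions_destroy(&fa);
    if (err != 0)
        return -1;

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

This works when the preparation can be expressed as file actions or spawn attributes. Preparation that needs arbitrary code between the fork and the exec has no place to go in this interface.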

Joerg


Re: Potential new syscall

2018-04-02 Thread Robert Elz
Date: Mon, 2 Apr 2018 19:29:25 -0400 (EDT)
From: Mouse
Message-ID: <201804022329.taa19...@stone.rodents-montreal.org>

  | Basically, what I want is a syscall that a vfork()ed child can call to
  | have the unsharing effects of execve(2) or _exit(2)

I have considered (and think I mentioned on some list some time
ago) the same capability to improve sh performance.

But as I don't want to think about pure performance changes until
I am happy that the functionality is as correct as I can make it,
pondering its usefulness is as far as I have gone so far.

Having the mechanism available for testing (even if it was not committed
to the standard NetBSD sources (yet?)) would be a real help, when it
comes time to evaluate whether that method, or converting sh to use
posix_spawn (an alternative that was suggested, but which I kind of
dread - considering the amount of work that would be involved) would
produce the best results.

Kamil - "just use fork" is a very common response, but no matter how
fork gets implemented, vfork() when used correctly always performs
better by huge margins.   You are of course correct that there is a very
limited set of functions possible in a vfork()'d child - which is why the
function proposed by Mouse would be useful - currently sh limits its
use of vfork() to the times it is very unlikely any of those will be
needed, and uses fork() other times (costing performance if it was too
conservative) - and when the vfork() is attempted, and it turns out
something needs doing which won't work, the vfork()'d child (and whatever
work it had already done) is simply abandoned and it is all done again
after a fork() - also costing performance.   The ability to just convert a
vfork() into a fork() would avoid the waste in the latter case, and also allow
far more aggressive use of vfork() to help avoid the former losses.
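The conservative choice described above (vfork() only when no child-side preparation is expected, fork() otherwise) can be sketched roughly as follows. This is not sh's actual code: run_cmd, prep_fn and set_marker are hypothetical names, and the abandon-and-redo path is left out.

```c
#define _DEFAULT_SOURCE         /* expose vfork() and setenv() on glibc */
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical hook for work the child must do before exec. */
typedef int (*prep_fn)(void);

/* Example hook: modifies process state, so it is only safe after fork(). */
static int set_marker(void)
{
    return setenv("VF_DEMO", "1", 1);
}

/*
 * Run `cmd` via /bin/sh. With no preparation, take the cheap vfork()
 * path; with a preparation hook, pay for fork() so that the hook runs
 * in a private copy of the address space.
 * Returns the command's exit status, or -1 on failure.
 */
static int run_cmd(const char *cmd, prep_fn prep)
{
    int status;
    pid_t pid;

    assert(cmd != NULL);

    pid = (prep == NULL) ? vfork() : fork();
    if (pid == -1)
        return -1;

    if (pid == 0) {
        /* prep is non-NULL only on the fork() path. */
        if (prep != NULL && prep() != 0)
            _exit(126);
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);
    }

    if (waitpid(pid, &status, 0) == -1)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

The decision has to be made before the child exists, which is the cost being discussed: a guess that is too cautious pays for fork() needlessly, and a guess that is too optimistic means discarding the vfork()ed child and starting again.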

kre 



Re: Potential new syscall

2018-04-02 Thread Eric Hawicz

On 2018-04-02 07:29 PM, Mouse wrote:

> Basically, what I want is a syscall that a vfork()ed child can call to
> have the unsharing effects of execve(2) or _exit(2) (return the vmspace
> to the parent and let it continue), while the child carries on with
> a clone of the vmspace without actually doing an exec or exit.  This is


That sounds suspiciously like Linux's unshare(2) call, with the CLONE_VM 
option.


Eric



Re: Potential new syscall

2018-04-02 Thread Kamil Rytarowski
On 03.04.2018 01:29, Mouse wrote:
> 
> Thoughts?  (Not restricted to just the above details; thoughts on the
> general idea would also be interesting to me.)
> 

It might not be a satisfying answer... but go for fork(). I recommend
this option - no kernel changes involved... unless someone wants good
performance when executing memcached.

fork() tends to replace vfork() in similar software.

And while there, calling functions other than exec() after vfork() is
less portable than vfork(). TSan notes that this is/was Undefined
Behavior, and works around issues in openjdk (which closes file
descriptors) by wrapping fork().





Potential new syscall

2018-04-02 Thread Mouse
I'm writing a userland emulator - basically, a hardware emulator,
except that it just runs userland; instructions that would trap to
privileged mode are instead handled by the emulator.

While doing this, I ran into a problem: vfork.  Most syscalls are
fairly straightforward, but vfork is a problem.  Most of the problems
I've dealt with easily enough, but there is one that I feel a need for
a new syscall for.  I had a look at the modern vfork(2) manpage via the
web interface, and the SEE ALSO section gives me no reason to think
such a thing exists even in modern NetBSD.  (I could just implement
vfork as fork, but I'd prefer to be a faithful emulation if I can.)

Basically, what I want is a syscall that a vfork()ed child can call to
have the unsharing effects of execve(2) or _exit(2) (return the vmspace
to the parent and let it continue), while the child carries on with
a clone of the vmspace without actually doing an exec or exit.  This is
because the emulator does not exec in the hosting-OS sense when the
emulated process execs; I have found no other way to get the vfork
semantics right without forking-and-exiting, and that gets process
parenting wrong.  (This would be fixable by adding a manager process,
with everything for which parenting is relevant going via it, or by
having a single emulator process timeshare among all simulated
processes.  A new syscall seems cleaner to me, especially as it fills a
gap in the OS semantics.)
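The parenting problem with forking-and-exiting can be seen in a small userland sketch. It uses fork() for the intermediate step instead of vfork(), purely to keep the demonstration portable, and the function name parent_seen_after_fork_and_exit is hypothetical. The process that carries on reports who its parent is, and it is not the original process.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Fork an intermediate process that itself forks and then exits.
 * The surviving process reports its parent pid through a pipe.
 * Returns that pid as seen by the survivor, or -1 on error.
 */
static pid_t parent_seen_after_fork_and_exit(void)
{
    int pfd[2];
    pid_t child, seen = -1;

    if (pipe(pfd) == -1)
        return -1;

    child = fork();
    if (child == -1) {
        close(pfd[0]);
        close(pfd[1]);
        return -1;
    }

    if (child == 0) {
        /* Intermediate process: stands in for the vfork()ed child. */
        pid_t survivor = fork();
        if (survivor == 0) {
            /* The process that carries on: who is my parent now? */
            pid_t ppid = getppid();
            ssize_t n = write(pfd[1], &ppid, sizeof ppid);
            _exit(n == (ssize_t)sizeof ppid ? 0 : 1);
        }
        _exit(survivor == -1 ? 1 : 0);  /* releases the original parent */
    }

    /* Original process: collect the report and reap the intermediate. */
    close(pfd[1]);
    if (read(pfd[0], &seen, sizeof seen) != (ssize_t)sizeof seen)
        seen = -1;
    close(pfd[0]);
    waitpid(child, NULL, 0);
    return seen;
}
```

The reported pid is either the intermediate's or, once the intermediate has exited, that of whatever process the system reparents orphans to. Either way the original process cannot wait for the survivor, which is the kind of mismatch a manager process or a new syscall would be needed to avoid.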

It looks to me as though there is uvm support for the concept, in the
form of uvmspace_unshare(), in the version I'm working with; a little
searching makes it appear it's been diked out of uvm_map.c more
recently.  The syscall wrapper around it is only a few lines, basically
just uvmspace_unshare() plus, if PPWAIT is set, kicking the parent.

I offer for consideration the thought that something of the sort might
be worth adding.  I have code written, but I don't yet know whether
what I have works right - a test build is running as I type this - and
it's for a version years behind -current.

Thoughts?  (Not restricted to just the above details; thoughts on the
general idea would also be interesting to me.)
