Re: [TANGENT] run-command: use vfork instead of fork

2017-05-16 Thread Brandon Williams
On 05/16, Linus Torvalds wrote:
> On Tue, May 16, 2017 at 12:35 PM, Eric Wong wrote:
> >
> > Fwiw, most of the vfork preparation was already done by Brandon
> > and myself a few weeks ago, and cooking in pu.
> 
> Oh, interesting. Was that done for vfork(), or is it for something
> else? Some of the changes seem almost overly careful. Is this for
> prep-work porting to some odd environment that doesn't really have a
> MMU at all? There's nothing fundamentally wrong with allocating memory
> after fork().
> 
> But yes, it looks like it helps the vfork case.
> 
>Linus

I started working on the run-command code when I ran into a deadlock in
'git grep --recurse-submodules'.  When I added support for submodules to
grep I just assumed that launching a process (which atm is unfortunately
the only way to work on a submodule) would work in a multi-threaded
environment.  I was naive and wrong!

The deadlock was due to a malloc lock being held by thread 'A' while
thread 'B' tried to launch a process.  Since that lock was in a
locked state at the time of forking, it remained locked in the child
with no hope of ever being released, because thread 'A' does not exist
there.  So when the child process that thread 'B' spawned tried to
malloc a chunk of memory after forking, it deadlocked.
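
The failure mode described above can be reproduced in miniature. The
sketch below is illustrative only and is not taken from git: a toy spin
lock (`toy_lock`, a hypothetical stand-in for the allocator's internal
lock) is held by one thread while another thread forks. The child
inherits the lock in its locked state but not the thread that owns it.
To keep the example from actually hanging, the child only probes the
lock and reports the result through its exit status. All names here are
assumptions made for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for a lock inside malloc(). */
static atomic_flag toy_lock = ATOMIC_FLAG_INIT;
static atomic_int held;        /* set once thread A owns toy_lock */
static atomic_int release_it;  /* set when thread A may unlock    */

/* Thread A: take the lock and hold it until told to let go. */
static void *thread_a(void *arg)
{
    (void)arg;
    while (atomic_flag_test_and_set(&toy_lock))
        sched_yield();
    atomic_store(&held, 1);
    while (!atomic_load(&release_it))
        sched_yield();
    atomic_flag_clear(&toy_lock);
    return NULL;
}

/*
 * Plays the role of thread B: fork while A holds the lock.
 * Returns 1 if the child found the lock already taken, 0 if it
 * found it free, -1 on error.
 */
int child_sees_stale_lock(void)
{
    pthread_t a;
    pid_t pid;
    int status = 0;
    int rc;

    atomic_store(&held, 0);
    atomic_store(&release_it, 0);
    rc = pthread_create(&a, NULL, thread_a, NULL);
    assert(rc == 0);
    (void)rc;
    while (!atomic_load(&held))
        sched_yield();

    pid = fork();
    if (pid == 0) {
        /*
         * Child: thread A does not exist here, so nobody will
         * ever clear the flag.  A real lock() would spin or
         * block forever; we only probe once and report.
         */
        _exit(atomic_flag_test_and_set(&toy_lock) ? 1 : 0);
    }

    /* Parent: let A release the lock, then collect the child. */
    atomic_store(&release_it, 1);
    pthread_join(a, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

Calling `child_sees_stale_lock()` from a small `main()` (built with
`-pthread`) should return 1: the parent's copy of the lock is released
normally, while the child's copy stays locked for good.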

I didn't catch this in initial testing because glibc registers
atfork handlers in order to prevent this sort of thing, while libraries
like tcmalloc don't do this.
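
For readers unfamiliar with the mechanism, here is a minimal sketch of
the general atfork pattern using `pthread_atfork()`. It is not glibc's
actual implementation; `toy_lock` and the function names are
hypothetical. The prepare handler takes the lock before the fork, so
the fork can only happen while the forking thread itself owns it, and
both parent and child release it afterwards.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical stand-in for a library-internal lock. */
static pthread_mutex_t toy_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_once_t once = PTHREAD_ONCE_INIT;
static atomic_int held;

static void prepare(void) { pthread_mutex_lock(&toy_lock); }
static void release(void) { pthread_mutex_unlock(&toy_lock); }

/* Register the handlers exactly once per process. */
static void install_handlers(void)
{
    int rc = pthread_atfork(prepare, release, release);
    assert(rc == 0);
    (void)rc;
}

/* Thread A: hold the lock for a short while, then release it. */
static void *thread_a(void *arg)
{
    struct timespec ts = { 0, 20 * 1000 * 1000 }; /* 20 ms */
    (void)arg;
    pthread_mutex_lock(&toy_lock);
    atomic_store(&held, 1);
    nanosleep(&ts, NULL);
    pthread_mutex_unlock(&toy_lock);
    return NULL;
}

/*
 * Fork while thread A holds the lock.  Returns 1 if the child was
 * able to take the lock, 0 if it was not, -1 on error.
 */
int child_can_take_lock(void)
{
    pthread_t a;
    pid_t pid;
    int status = 0;

    pthread_once(&once, install_handlers);
    atomic_store(&held, 0);
    if (pthread_create(&a, NULL, thread_a, NULL) != 0)
        return -1;
    while (!atomic_load(&held))
        sched_yield();

    /* prepare() blocks inside fork() until thread A unlocks. */
    pid = fork();
    if (pid == 0) {
        /* The child handler already unlocked our copy. */
        _exit(pthread_mutex_trylock(&toy_lock) == 0 ? 0 : 1);
    }

    pthread_join(a, NULL);
    if (pid < 0 || waitpid(pid, &status, 0) < 0)
        return -1;
    if (!WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status) == 0 ? 1 : 0;
}
```

With the handlers installed, `child_can_take_lock()` should return 1.
An allocator that registers nothing comparable leaves the child exposed
to the stale-lock case.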

So to account for this, I worked to make run-command safe to use in the
presence of threads, which had the benefit of also preparing it to be
vfork() ready.

Ultimately I'd like to drop the requirement to spawn a child process to
work on a submodule, but that's going to take a lot more effort.

-- 
Brandon Williams


Re: [TANGENT] run-command: use vfork instead of fork

2017-05-16 Thread Linus Torvalds
On Tue, May 16, 2017 at 12:35 PM, Eric Wong wrote:
>
> Fwiw, most of the vfork preparation was already done by Brandon
> and myself a few weeks ago, and cooking in pu.

Oh, interesting. Was that done for vfork(), or is it for something
else? Some of the changes seem almost overly careful. Is this for
prep-work porting to some odd environment that doesn't really have a
MMU at all? There's nothing fundamentally wrong with allocating memory
after fork().

But yes, it looks like it helps the vfork case.

   Linus


[TANGENT] run-command: use vfork instead of fork

2017-05-16 Thread Eric Wong
Linus Torvalds wrote:
> Also, if people really want to optimize the code that executes an
> external program (whether in shell or directly), I think it might be
> worth it to look at replacing the "fork()" with a "vfork()".
> 
> Something like this
> 
> -   cmd->pid = fork();
> +   cmd->pid = (cmd->git_cmd || cmd->env) ? fork() : vfork();
> 
> might work (the native git_cmd case needs a real fork, and if we
> change the environment variables we need it too, but the other cases
> look like they might work with vfork()).
> 
> Using vfork() can be hugely more efficient, because you don't have the
> extra page table copies and teardown, but also avoid a lot of possible
> copy-on-write faults.

Fwiw, most of the vfork preparation was already done by Brandon
and myself a few weeks ago, and cooking in pu.

I think only the patch below would be needed to enable vfork
(along with any build-time detection)

However, I haven't noticed enough forking in git to make a
difference (but maybe others do).  I think it would make a
bigger difference if such changes were made to bash, dash,
make, and perl5.
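
As a standalone illustration of the constraint involved (a sketch, not
code from git), the helper below spawns a program with vfork(). The
function name `run_with_vfork` is hypothetical. Between vfork() and
exec the child borrows the parent's memory, so it does nothing except
call execv() and, on failure, _exit(); it must not return from the
function or write to shared state.

```c
#define _DEFAULT_SOURCE
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Run `path` with `argv` and wait for it.  Returns the child's exit
 * status, 127 if the exec failed, or -1 on vfork/waitpid errors or
 * if the child was killed by a signal.
 */
int run_with_vfork(const char *path, char *const argv[])
{
    int status = 0;
    pid_t pid = vfork();

    if (pid < 0)
        return -1;
    if (pid == 0) {
        /*
         * Child: the parent is suspended and we share its
         * address space.  Only exec or _exit from here.
         */
        execv(path, argv);
        _exit(127);
    }

    /* Parent resumes once the child has exec'd or exited. */
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, calling it with "/bin/sh" and the arguments
{"sh", "-c", "exit 3", NULL} should return 3, and a path that does not
exist should return 127.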

8<
Subject: [PATCH] run-command: use vfork instead of fork

To enable vfork, we merely have to avoid modifying memory we
share with the parent, so the guard functions
`child_(error|warn|die)_fn` can now be disabled.

FIXME: still missing autoconf + Makefile portability tweaks.

Signed-off-by: Eric Wong 
---
 run-command.c | 28 +---------------------------
 1 file changed, 1 insertion(+), 27 deletions(-)

diff --git a/run-command.c b/run-command.c
index 9e36151bf9..0292dd94b6 100644
--- a/run-command.c
+++ b/run-command.c
@@ -324,25 +324,6 @@ static void fake_fatal(const char *err, va_list params)
    vreportf("fatal: ", err, params);
 }
 
-static void child_error_fn(const char *err, va_list params)
-{
-   const char msg[] = "error() should not be called in child\n";
-   xwrite(2, msg, sizeof(msg) - 1);
-}
-
-static void child_warn_fn(const char *err, va_list params)
-{
-   const char msg[] = "warn() should not be called in child\n";
-   xwrite(2, msg, sizeof(msg) - 1);
-}
-
-static void NORETURN child_die_fn(const char *err, va_list params)
-{
-   const char msg[] = "die() should not be called in child\n";
-   xwrite(2, msg, sizeof(msg) - 1);
-   _exit(2);
-}
-
 /* this runs in the parent process */
 static void child_err_spew(struct child_process *cmd, struct child_err *cerr)
 {
@@ -658,17 +639,10 @@ int start_command(struct child_process *cmd)
     * never be released in the child process.  This means only
     * Async-Signal-Safe functions are permitted in the child.
     */
-   cmd->pid = fork();
+   cmd->pid = vfork();
    failed_errno = errno;
    if (!cmd->pid) {
        int sig;
-       /*
-        * Ensure the default die/error/warn routines do not get
-        * called, they can take stdio locks and malloc.
-        */
-       set_die_routine(child_die_fn);
-       set_error_routine(child_error_fn);
-       set_warn_routine(child_warn_fn);
 
        close(notify_pipe[0]);
        set_cloexec(notify_pipe[1]);
-- 
Also fetchable:  git://bogomips.org/git-svn vfork-test
commit 5f88d79182aaabc5ea467d1d29e13e45bd2b99bf