Re: Regression: Ctrl-c from the pager in an alias exits it
On Fri, Jan 06, 2017 at 11:42:25PM +0100, Johannes Sixt wrote: > > So I dunno. Maybe this waiting should be restricted only to certain > > cases like executing git sub-commands. > > If given it some thought. > > In general, I think it is wrong to wait for child processes when a signal > was received. After all, it is the purpose of a (deadly) signal to have the > process go away. There may be programs that know it better, like less, but > git should not attempt to know better in general. > > We do apply some special behavior for certain cases like we do for the > pager. And now the case with aliases is another special situation. The > parent git process only delegates to the child, and as such it is reasonable > that it binds its life time to the first child, which executes the expanded > alias. Yeah, I think I agree. That binding is something you want in many cases, but not necessarily all. The original purpose of clean_on_exit was to create a binding like that, but of course it can be (and with the smudge-filter stuff, arguably has been) used for other cases, too. I'll work up a patch that makes it a separate option, which should be pretty easy. -Peff
Re: Regression: Ctrl-c from the pager in an alias exits it
Am 06.01.2017 um 20:41 schrieb Jeff King: On Fri, Jan 06, 2017 at 03:39:59PM +0100, Johannes Sixt wrote: diff --git a/run-command.c b/run-command.c index ca905a9e80..db47c429b7 100644 --- a/run-command.c +++ b/run-command.c @@ -29,6 +29,8 @@ static int installed_child_cleanup_handler; static void cleanup_children(int sig, int in_signal) { + struct child_to_clean *children_to_wait_for = NULL; + while (children_to_clean) { struct child_to_clean *p = children_to_clean; children_to_clean = p->next; @@ -45,6 +47,17 @@ static void cleanup_children(int sig, int in_signal) } kill(p->pid, sig); + p->next = children_to_wait_for; + children_to_wait_for = p; + } + + while (children_to_wait_for) { + struct child_to_clean *p = children_to_wait_for; + children_to_wait_for = p->next; + + while (waitpid(p->pid, NULL, 0) < 0 && errno == EINTR) + ; /* spin waiting for process exit or error */ + if (!in_signal) free(p); } This looks like the minimal change necessary. I wonder, though, whether the new local variable is really required. Wouldn't it be sufficient to walk the children_to_clean chain twice? Yeah, I considered that. The fact that we disassemble the list in the first loop has two side effects: 1. It lets us free the list as we go (for the !in_signal case). 2. If we were to get another signal, it makes us sort-of reentrant. We will only kill and wait for each pid once. Obviously (1) moves down to the lower loop, but I was trying to preserve (2). I'm not sure if it is worth bothering, though. Makes sense. The way we pull items off of the list is certainly not atomic (it does shorten the race to a few instructions, though, versus potentially waiting on waitpid() to return). My bigger concern with the whole thing is whether we could hit some sort of deadlock if the child doesn't die when we send it a signal. E.g., imagine we have a pipe open to the child and somebody sends SIGTERM to us. We propagate SIGTERM to the child, and then waitpid() for it. The child decides to ignore our SIGTERM for some reason and keep reading until EOF on the pipe. It won't ever get it, and the two processes will hang forever. You can argue perhaps that the child is broken in that case. And I doubt this could trigger when running a git sub-command. But we may add more children in the future. Right now we use it for the new multi-file clean/smudge filters. They use the hook feature to close the descriptors, but note that that won't run in the in_signal case. So I dunno. Maybe this waiting should be restricted only to certain cases like executing git sub-commands. If given it some thought. In general, I think it is wrong to wait for child processes when a signal was received. After all, it is the purpose of a (deadly) signal to have the process go away. There may be programs that know it better, like less, but git should not attempt to know better in general. We do apply some special behavior for certain cases like we do for the pager. And now the case with aliases is another special situation. The parent git process only delegates to the child, and as such it is reasonable that it binds its life time to the first child, which executes the expanded alias. -- Hannes
Re: Regression: Ctrl-c from the pager in an alias exits it
On Fri, Jan 06, 2017 at 03:39:59PM +0100, Johannes Sixt wrote: > > diff --git a/run-command.c b/run-command.c > > index ca905a9e80..db47c429b7 100644 > > --- a/run-command.c > > +++ b/run-command.c > > @@ -29,6 +29,8 @@ static int installed_child_cleanup_handler; > > > > static void cleanup_children(int sig, int in_signal) > > { > > + struct child_to_clean *children_to_wait_for = NULL; > > + > > while (children_to_clean) { > > struct child_to_clean *p = children_to_clean; > > children_to_clean = p->next; > > @@ -45,6 +47,17 @@ static void cleanup_children(int sig, int in_signal) > > } > > > > kill(p->pid, sig); > > + p->next = children_to_wait_for; > > + children_to_wait_for = p; > > + } > > + > > + while (children_to_wait_for) { > > + struct child_to_clean *p = children_to_wait_for; > > + children_to_wait_for = p->next; > > + > > + while (waitpid(p->pid, NULL, 0) < 0 && errno == EINTR) > > + ; /* spin waiting for process exit or error */ > > + > > if (!in_signal) > > free(p); > > } > > > > This looks like the minimal change necessary. I wonder, though, whether the > new local variable is really required. Wouldn't it be sufficient to walk the > children_to_clean chain twice? Yeah, I considered that. The fact that we disassemble the list in the first loop has two side effects: 1. It lets us free the list as we go (for the !in_signal case). 2. If we were to get another signal, it makes us sort-of reentrant. We will only kill and wait for each pid once. Obviously (1) moves down to the lower loop, but I was trying to preserve (2). I'm not sure if it is worth bothering, though. The way we pull items off of the list is certainly not atomic (it does shorten the race to a few instructions, though, versus potentially waiting on waitpid() to return). My bigger concern with the whole thing is whether we could hit some sort of deadlock if the child doesn't die when we send it a signal. E.g., imagine we have a pipe open to the child and somebody sends SIGTERM to us. We propagate SIGTERM to the child, and then waitpid() for it. The child decides to ignore our SIGTERM for some reason and keep reading until EOF on the pipe. It won't ever get it, and the two processes will hang forever. You can argue perhaps that the child is broken in that case. And I doubt this could trigger when running a git sub-command. But we may add more children in the future. Right now we use it for the new multi-file clean/smudge filters. They use the hook feature to close the descriptors, but note that that won't run in the in_signal case. So I dunno. Maybe this waiting should be restricted only to certain cases like executing git sub-commands. -Peff
Re: Regression: Ctrl-c from the pager in an alias exits it
Am 06.01.2017 um 08:32 schrieb Jeff King: On Fri, Jan 06, 2017 at 02:26:02AM -0500, Jeff King wrote: You'll notice that it actually calls wait() on the pager. That's due to a3da882120 (pager: do wait_for_pager on signal death, 2009-01-22), which IIRC was addressing a very similar problem. We want to stop feeding the pager when we get a signal, but we don't want the main process to actually exit, or the pager loses the controlling terminal. In our new scenario we have an extra process, though. The git-log child will wait on the pager, but the parent process can't. It doesn't know about it. I think that it in turn needs to wait on the child when it dies, and then the whole chain will stand still until the pager exits. And here's a patch to do that. It seems to work. I'll sleep on it and then write up a commit message tomorrow if it still makes sense. diff --git a/run-command.c b/run-command.c index ca905a9e80..db47c429b7 100644 --- a/run-command.c +++ b/run-command.c @@ -29,6 +29,8 @@ static int installed_child_cleanup_handler; static void cleanup_children(int sig, int in_signal) { + struct child_to_clean *children_to_wait_for = NULL; + while (children_to_clean) { struct child_to_clean *p = children_to_clean; children_to_clean = p->next; @@ -45,6 +47,17 @@ static void cleanup_children(int sig, int in_signal) } kill(p->pid, sig); + p->next = children_to_wait_for; + children_to_wait_for = p; + } + + while (children_to_wait_for) { + struct child_to_clean *p = children_to_wait_for; + children_to_wait_for = p->next; + + while (waitpid(p->pid, NULL, 0) < 0 && errno == EINTR) + ; /* spin waiting for process exit or error */ + if (!in_signal) free(p); } This looks like the minimal change necessary. I wonder, though, whether the new local variable is really required. Wouldn't it be sufficient to walk the children_to_clean chain twice? -- Hannes
Re: Regression: Ctrl-c from the pager in an alias exits it
On Fri, Jan 06, 2017 at 02:32:25 -0500, Jeff King wrote: > And here's a patch to do that. It seems to work. Nice, thanks! This works very well for me too. -- Trygve Aaberge
Re: Regression: Ctrl-c from the pager in an alias exits it
On Fri, Jan 06, 2017 at 02:26:02AM -0500, Jeff King wrote: > You'll notice that it actually calls wait() on the pager. That's due to > a3da882120 (pager: do wait_for_pager on signal death, 2009-01-22), which > IIRC was addressing a very similar problem. We want to stop feeding the > pager when we get a signal, but we don't want the main process to > actually exit, or the pager loses the controlling terminal. > > In our new scenario we have an extra process, though. The git-log child > will wait on the pager, but the parent process can't. It doesn't know > about it. I think that it in turn needs to wait on the child when it > dies, and then the whole chain will stand still until the pager exits. And here's a patch to do that. It seems to work. I'll sleep on it and then write up a commit message tomorrow if it still makes sense. diff --git a/run-command.c b/run-command.c index ca905a9e80..db47c429b7 100644 --- a/run-command.c +++ b/run-command.c @@ -29,6 +29,8 @@ static int installed_child_cleanup_handler; static void cleanup_children(int sig, int in_signal) { + struct child_to_clean *children_to_wait_for = NULL; + while (children_to_clean) { struct child_to_clean *p = children_to_clean; children_to_clean = p->next; @@ -45,6 +47,17 @@ static void cleanup_children(int sig, int in_signal) } kill(p->pid, sig); + p->next = children_to_wait_for; + children_to_wait_for = p; + } + + while (children_to_wait_for) { + struct child_to_clean *p = children_to_wait_for; + children_to_wait_for = p->next; + + while (waitpid(p->pid, NULL, 0) < 0 && errno == EINTR) + ; /* spin waiting for process exit or error */ + if (!in_signal) free(p); }
Re: Regression: Ctrl-c from the pager in an alias exits it
On Fri, Jan 06, 2017 at 01:47:52AM -0500, Jeff King wrote: > > > To reproduce, create e.g. the alias `l = log` and run `git l`. Then press > > > Ctrl-c. The expected behavior is that nothing happens. The actual > > > behavior is > > > that the pager exits. > > > > That's weird. I can't reproduce at all here. But I also can't think of a > > thing that Git could do that would impact the behavior. For example: > > I take it back. I could not reproduce under my normal config, which sets > the pager to "diff-highlight | less". But if I drop that config, I can > reproduce easily. And bisect it to that same commit, 86d26f240f > (setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when .., > 2015-12-20). I made a little progress with strace. Prior to 86d26f240f, we would run the "log" builtin directly inside the same executable. After it, we exec a separate process, and the process tree looks like: | `-bash,10123 | `-git,10125 l | `-git-log,10127 | `-less,10128 Now if I strace all of those, I see (I've reordered and snipped a bit for clarity): $ strace -p 10125 -p 10127 -p 10128 strace: Process 10125 attached strace: Process 10127 attached strace: Process 10128 attached [pid 10127] write(1, "\n\33[33mcommit bae73e80d48ace1faa3"..., 1481 [pid 10128] read(5, [pid 10125] wait4(10127, The main git process is waiting for the child to finish, the child is blocked writing to the pager, and the pager is waiting for input from the terminal (fd 5). So I hit ^C: [pid 10128] <... read resumed> 0x7ffd39153b57, 1) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) [pid 10127] <... write resumed> ) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) [pid 10125] <... wait4 resumed> 0x7ffe88d0a560, 0, NULL) = ? ERESTARTSYS (To be restarted if SA_RESTART is set) [pid 10128] --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} --- [pid 10127] --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} --- [pid 10125] --- SIGINT {si_signo=SIGINT, si_code=SI_KERNEL} --- Everybody gets the signal... [pid 10127] write(1, "\n\33[33mcommit bae73e80d48ace1faa3"..., 1481 The blocked writer will resume its write() until it returns. This is the same as it would get under the older code, too (after write() returns it will handle the signal and die). [pid 10125] kill(10127, SIGINT [pid 10125] <... kill resumed> )= 0 The parent git process tries to propagate the signal to the child. Unnecessary in this instance, but helpful when the signal is only delivered to the parent. This is due to [pid 10128] rt_sigaction(SIGINT, {sa_handler=0x558dd1af0300, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f03fc8c2040}, [pid 10128] <... rt_sigaction resumed> {sa_handler=0x558dd1af0300, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7f03fc8c2040}, 8) = 0 [pid 10128] rt_sigprocmask(SIG_SETMASK, [], [pid 10128] <... rt_sigprocmask resumed> NULL, 8) = 0 [pid 10128] write(1, "\7\r\33[K:\33[K", 9 [pid 10128] <... write resumed> ) = 9 [pid 10128] read(5, And here's the pager handling the signal, and going back to waiting for terminal input. [pid 10125] rt_sigaction(SIGINT, {sa_handler=SIG_DFL, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fbe2b4c1040}, [pid 10125] <... rt_sigaction resumed> {sa_handler=0x55aec373a6a0, sa_mask=[INT], sa_flags=SA_RESTORER|SA_RESTART, sa_restorer=0x7fbe2b4c1040}, 8) = 0 [pid 10125] rt_sigprocmask(SIG_BLOCK, ~[RTMIN RT_1], [INT], 8) = 0 [pid 10125] getpid( [pid 10125] <... getpid resumed> ) = 10125 [pid 10125] gettid()= 10125 [pid 10125] tgkill(10125, 10125, SIGINT) = 0 [pid 10125] rt_sigprocmask(SIG_SETMASK, [INT], NULL, 8) = 0 [pid 10125] rt_sigreturn({mask=[]}) = 61 [pid 10125] --- SIGINT {si_signo=SIGINT, si_code=SI_TKILL, si_pid=10125, si_uid=1000} --- [pid 10125] +++ killed by SIGINT +++ The parent process pops the signal handler and propagates to itself, dying. At this point things pause, and nothing happens. But as soon as I hit a key, the pager dies: [pid 10128] <... read resumed> "\r", 1) = 1 [pid 10128] write(1, "\r\33[K", 4) = 4 [pid 10128] write(1, " \"Another round of MIPS fixe"..., 50) = 50 [pid 10128] read(5, 0x7ffd39153b57, 1) = -1 EIO (Input/output error) [pid 10128] write(1, "\r\33[K\33[?1l\33>", 11) = 11 [pid 10128] fsync(5)= -1 EINVAL (Invalid argument) [pid 10128] ioctl(5, TCGETS, {B38400 opost isig -icanon -echo ...}) = 0 [pid 10128] ioctl(5, SNDCTL_TMR_STOP or TCSETSW, {B38400 opost isig icanon echo ...}) = -1 EIO (Input/output error) [pid 10128] exit_group(1) = ? [pid 10128] +++ exited with 1 +++ The key thing is the EIO we get reading from the terminal. I think because the head of the process group died, we lost our controlling terminal. And then naturally the git-log process gets
Re: Regression: Ctrl-c from the pager in an alias exits it
On Fri, Jan 06, 2017 at 01:40:32AM -0500, Jeff King wrote: > On Thu, Jan 05, 2017 at 03:25:29PM +0100, Trygve Aaberge wrote: > > > I'm experiencing an issue when using aliases for commands that open the > > pager. > > When I press Ctrl-c from the pager, it exits. This does not happen when I > > don't use an alias and did not happen before. It causes problems because > > Ctrl-c is also used for other things, such as canceling a search that hasn't > > completed. > > > > To reproduce, create e.g. the alias `l = log` and run `git l`. Then press > > Ctrl-c. The expected behavior is that nothing happens. The actual behavior > > is > > that the pager exits. > > That's weird. I can't reproduce at all here. But I also can't think of a > thing that Git could do that would impact the behavior. For example: I take it back. I could not reproduce under my normal config, which sets the pager to "diff-highlight | less". But if I drop that config, I can reproduce easily. And bisect it to that same commit, 86d26f240f (setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when .., 2015-12-20). -Peff
Re: Regression: Ctrl-c from the pager in an alias exits it
On Thu, Jan 05, 2017 at 03:25:29PM +0100, Trygve Aaberge wrote: > I'm experiencing an issue when using aliases for commands that open the pager. > When I press Ctrl-c from the pager, it exits. This does not happen when I > don't use an alias and did not happen before. It causes problems because > Ctrl-c is also used for other things, such as canceling a search that hasn't > completed. > > To reproduce, create e.g. the alias `l = log` and run `git l`. Then press > Ctrl-c. The expected behavior is that nothing happens. The actual behavior is > that the pager exits. That's weird. I can't reproduce at all here. But I also can't think of a thing that Git could do that would impact the behavior. For example: 1. Git should never kill() the pager; in fact it waits for it to finish. 2. Git can impact the pager environment and command line options, but those haven't changed anytime recently (and besides, I'm not sure there even is an option to convince less to die). 3. Git can impact the set of blocked signals that the pager sees, but I think less would set up its own handler for SIGINT anyway. I suppose it's possible that your shell or another program (tmux, maybe?) catches the SIGINT and does a process-group kill. But then I don't know why it would matter that you're using an alias. The process tree without an alias: |-xterm,21861,peff | `-bash,21863 | `-git,22376 log | `-less,22377 and with: |-xterm,21861,peff | `-bash,21863 | `-git,22391 l | `-git-log,22393 | `-less,22394 are pretty similar. Puzzling. -Peff
Regression: Ctrl-c from the pager in an alias exits it
I'm experiencing an issue when using aliases for commands that open the pager. When I press Ctrl-c from the pager, it exits. This does not happen when I don't use an alias and did not happen before. It causes problems because Ctrl-c is also used for other things, such as canceling a search that hasn't completed. To reproduce, create e.g. the alias `l = log` and run `git l`. Then press Ctrl-c. The expected behavior is that nothing happens. The actual behavior is that the pager exits. I bisected the repo, and found that the commit 86d26f240 [0] introduced the issue. [0]: 86d26f240 (setup.c: re-fix d95138e (setup: set env $GIT_WORK_TREE when .. - 2015-12-20) -- Trygve Aaberge