> On 01 Oct 2016, at 22:48, Jakub Narębski <[email protected]> wrote:
>
> W dniu 01.10.2016 o 20:59, Lars Schneider pisze:
>> On 29 Sep 2016, at 23:27, Junio C Hamano <[email protected]> wrote:
>>> Lars Schneider <[email protected]> writes:
>>>
>>> If the filter process refuses to die forever when Git told it to
>>> shutdown (by closing the pipe to it, for example), that filter
>>> process is simply buggy. I think we want users to become aware of
>>> that, instead of Git leaving it behind, which essentially is to
>>> sweep the problem under the rug.
>
> Well, it would be good to tell users _why_ Git is hanging, see below.
Agreed. Do you think it is OK to write the message to stderr?
>>> I agree with what Peff said elsewhere in the thread; if a filter
>>> process wants to take time to clean things up while letting Git
>>> proceed, it can do its own process management, but I think it is
>>> sensible for Git to wait the filter process it directly spawned.
>>
>> To realize the approach above I prototyped the run-command patch below:
>>
>> I added an "exit_timeout" variable to the "child_process" struct.
>> On exit, Git will close the pipe to the process and wait "exit_timeout"
>> seconds until it kills the child process. If "exit_timeout" is negative
>> then Git will wait until the process is done.
>
> That might be good approach. Probably the default would be to wait.
I think I would prefer a 2sec timeout or something as default. This way
we can ensure Git would not wait indefinitely for a buggy filter by default.
>> If we use that in the long running filter process, then we could make
>> the timeout even configurable. E.g. with "filter.<driver>.process-timeout".
>
> Sidenote: we prefer camelCase rather than kebab-case for config
> variables, that is, "filter.<driver>.processTimeout".
Sure!
> Also, how would one set default value of timeout for all process
> based filters?
I think we don't need that because a timeout is always specific
to a filter (if the 2sec default is not sufficient).
>>
>> + while ((waitpid(p->pid, &status, 0)) < 0 && errno ==
>> EINTR)
>> + ; /* nothing */
>
> Ah, this loop is here because waiting on waitpid() can be interrupted
> by the delivery of a signal to the calling process; though the result
> is -1, not just any < 0.
"< 0" is also used in wait_or_whine()
>> + while (getpgid(p->pid) >= 0 && tv.tv_sec - secs <
>> p->timeout) {
>> + gettimeofday(&tv, NULL);
>> + }
>
> WTF? Busy wait loop???
This was just a quick prototype to show "my thinking direction".
I wasn't expecting a review. Sorry :-)
> Also, if we want to wait for child without blocking, then instead
> of cryptic getpgid(p->pid) maybe use waitpid(p->pid, &status, WNOHANG);
> it is more explicit.
Sure!
> There is also another complication: there can be more than one
> long-running filter driver used. With this implementation we
> wait for each of one in sequence, e.g. 10s + 10s + 10s.
Good idea, I fixed that in the version below!
>>
>> -static void mark_child_for_cleanup(pid_t pid)
>> +static void mark_child_for_cleanup(pid_t pid, int timeout, int stdin)
>
> Hmmmm... three parameters is not too much, but we might want to
> pass "struct child_process *" directly if this number grows.
I used parameters because this function is also used with the async struct...
I've attached a more serious patch for review below.
What do you think?
Thanks,
Lars
diff --git a/run-command.c b/run-command.c
index 3269362..ca0feef 100644
--- a/run-command.c
+++ b/run-command.c
@@ -21,6 +21,9 @@ void child_process_clear(struct child_process *child)
struct child_to_clean {
pid_t pid;
+ char *name;
+ int stdin;
+ int timeout;
struct child_to_clean *next;
};
static struct child_to_clean *children_to_clean;
@@ -28,12 +31,53 @@ static int installed_child_cleanup_handler;
static void cleanup_children(int sig, int in_signal)
{
+ int status;
+ struct timeval tv;
+ time_t secs;
+ struct child_to_clean *p = children_to_clean;
+
+ // Send EOF to children as indicator that Git will exit soon
+ while (p) {
+ if (p->timeout != 0) {
+ if (p->stdin > 0)
+ close(p->stdin);
+ }
+ p = p->next;
+ }
+
while (children_to_clean) {
- struct child_to_clean *p = children_to_clean;
+ p = children_to_clean;
children_to_clean = p->next;
+
+ if (p->timeout != 0) {
+ fprintf(stderr, _("Waiting for '%s' to finish..."),
p->name);
+ if (p->timeout < 0) {
+ // No timeout given - wait indefinitely
+ while ((waitpid(p->pid, &status, 0)) < 0 &&
errno == EINTR)
+ ; /* nothing */
+ } else {
+ // Wait until timeout
+ gettimeofday(&tv, NULL);
+ secs = tv.tv_sec;
+ while (!waitpid(p->pid, &status, WNOHANG) &&
+ tv.tv_sec - secs < p->timeout) {
+ fprintf(stderr, _(" \rWaiting %lds for
'%s' to finish..."),
+ p->timeout - tv.tv_sec + secs -
1, p->name);
+ gettimeofday(&tv, NULL);
+ sleep_millisec(10);
+ }
+ }
+ if (waitpid(p->pid, &status, WNOHANG))
+ fprintf(stderr, _("done!\n"));
+ else
+ fprintf(stderr, _("timeout. Killing...\n"));
+ }
+
kill(p->pid, sig);
- if (!in_signal)
+ if (!in_signal) {
+ free(p->name);
free(p);
+ }
}
}
@@ -49,10 +93,18 @@ static void cleanup_children_on_exit(void)
cleanup_children(SIGTERM, 0);
}
-static void mark_child_for_cleanup(pid_t pid)
+static void mark_child_for_cleanup_with_timeout(pid_t pid, const char *name,
int stdin, int timeout)
{
struct child_to_clean *p = xmalloc(sizeof(*p));
p->pid = pid;
+ p->timeout = timeout;
+ p->stdin = stdin;
+ if (name) {
+ p->name = xmalloc(strlen(name) + 1);
+ strcpy(p->name, name);
+ } else {
+ p->name = "process";
+ }
p->next = children_to_clean;
children_to_clean = p;
@@ -63,6 +115,13 @@ static void mark_child_for_cleanup(pid_t pid)
}
}
+#ifdef NO_PTHREADS
+static void mark_child_for_cleanup(pid_t pid, const char *name, int timeout,
int stdin)
+{
+ mark_child_for_cleanup_with_timeout(pid, NULL, 0, 0);
+}
+#endif
+
static void clear_child_for_cleanup(pid_t pid)
{
struct child_to_clean **pp;
@@ -422,7 +481,8 @@ int start_command(struct child_process *cmd)
if (cmd->pid < 0)
error_errno("cannot fork() for %s", cmd->argv[0]);
else if (cmd->clean_on_exit)
- mark_child_for_cleanup(cmd->pid);
+ mark_child_for_cleanup_with_timeout(
+ cmd->pid, cmd->argv[0], cmd->in,
cmd->clean_on_exit_timeout);
/*
* Wait for child's execvp. If the execvp succeeds (or if fork()
@@ -483,7 +543,8 @@ int start_command(struct child_process *cmd)
if (cmd->pid < 0 && (!cmd->silent_exec_failure || errno != ENOENT))
error_errno("cannot spawn %s", cmd->argv[0]);
if (cmd->clean_on_exit && cmd->pid >= 0)
- mark_child_for_cleanup(cmd->pid);
+ mark_child_for_cleanup_with_timeout(
+ cmd->pid, cmd->argv[0], cmd->in,
cmd->clean_on_exit_timeout);
argv_array_clear(&nargv);
cmd->argv = sargv;
diff --git a/run-command.h b/run-command.h
index cf29a31..4c1c1f4 100644
--- a/run-command.h
+++ b/run-command.h
@@ -43,6 +43,16 @@ struct child_process {
unsigned stdout_to_stderr:1;
unsigned use_shell:1;
unsigned clean_on_exit:1;
+ /*
+ * clean_on_exit_timeout is only considered if clean_on_exit is set.
+ * - Specify 0 to kill the child on Git exit (default)
+ * - Specify a negative value to close the child's stdin on Git exit
+ * and wait indefinitely for the child's termination.
+ * - Specify a positive value to close the child's stdin on Git exit
+ * and wait clean_on_exit_timeout seconds for the child's
+ * termination.
+ */
+ int clean_on_exit_timeout;
};
#define CHILD_PROCESS_INIT { NULL, ARGV_ARRAY_INIT, ARGV_ARRAY_INIT }