Re: [PATCH v8 00/11] Git filter protocol

2016-10-06 Thread Junio C Hamano
Jeff King  writes:

> I am not sure if I have followed all of this discussion, but I am of the
> opinion that Git should not be doing any timeouts at all.
> ...
> If this is debugging output of the form "now I am calling wait() on all
> of the filters", without respect to any timeout, that sounds reasonable.
> Though I would argue that run-command should simply trace_printf()
> any processes it is waiting for, which covers _any_ process, not just
> the filters. That seems like a good match for the rest of the GIT_TRACE
> output, which describes how and when we spawn the sub-processes.

Yup, I agree that having trace_printf() report the wait for any
process is the cleanest way to go.  As you guessed the reason why
Lars is bringing up "now we are waiting for this filter" is because
I suggested it as a way to encourage users to file bugs when they
see a hung Git.  Originally Lars wanted to have a timeout on wait
and after the timeout wanted to kill the process, and because I
really did not want such a random "you are too slow to die, so I'll
send a signal to you and exit myself without making sure you died"
there, I suggested that if we were to have a timeout, that would be
to timeout the wait only to have a chance to tell the user "we are
stuck waiting on this thing" (and then go back to wait), as it would
either be a buggy filter (i.e. the users need to debug their own
filter code) or a buggy use of wait on Git side (i.e. we would want
to hear about such bugs from them).

Without such a "wait with timeout so that we can tell the user", we
can still respond to "my 'git checkout' hangs forever" with "try
running with GIT_TRACE" as you outlined above, so I do not think we
need the timeout.

Thanks for straightening us out.




Re: [PATCH v8 00/11] Git filter protocol

2016-10-06 Thread Jeff King
On Thu, Oct 06, 2016 at 03:13:19PM +0200, Lars Schneider wrote:

> >>> Well, it would be good to tell users _why_ Git is hanging, see below.
> >> 
> >> Agreed. Do you think it is OK to write the message to stderr?
> > 
> > On the other hand, this is why GIT_TRACE (and GIT_TRACE_PERFORMANCE)
> > was invented for.  We do not signal troubles with single-shot filters,
> > so I guess doing it for multi-file filters is not needed.
> 
> I am on the fence with this one.
> 
> @Junio/Peff:
> Where would you prefer to see a "Waiting for filter 'XYZ'... " message?
> On stderr or via GIT_TRACE?

I am not sure if I have followed all of this discussion, but I am of the
opinion that Git should not be doing any timeouts at all. There are
simply too many places where the filter (or any other process that git
is depending on) could inexplicably hang, and I
do not want to pepper random timeouts for all parts of the procedure
where we say "woah, this is taking longer than expected" (nor do I want
to have a timeout for _one_ spot, and ignore all the others).

If this is debugging output of the form "now I am calling wait() on all
of the filters", without respect to any timeout, that sounds reasonable.
Though I would argue that run-command should simply trace_printf()
any processes it is waiting for, which covers _any_ process, not just
the filters. That seems like a good match for the rest of the GIT_TRACE
output, which describes how and when we spawn the sub-processes.

Something like:

diff --git a/run-command.c b/run-command.c
index 5a4dbb6..b884605 100644
--- a/run-command.c
+++ b/run-command.c
@@ -226,6 +226,9 @@ static int wait_or_whine(pid_t pid, const char *argv0, int 
in_signal)
pid_t waiting;
int failed_errno = 0;
 
+   if (!in_signal)
+   trace_printf("waiting for pid %d", (int)pid);
+
while ((waiting = waitpid(pid, , 0)) < 0 && errno == EINTR)
;   /* nothing */
if (in_signal)

but it does not have to be part of this series, I think.

-Peff


Re: [PATCH v8 00/11] Git filter protocol

2016-10-06 Thread Lars Schneider

> On 04 Oct 2016, at 21:04, Jakub Narębski  wrote:
> 
> W dniu 03.10.2016 o 19:13, Lars Schneider pisze: 
>>> On 01 Oct 2016, at 22:48, Jakub Narębski  wrote:
>>> W dniu 01.10.2016 o 20:59, Lars Schneider pisze: 
 On 29 Sep 2016, at 23:27, Junio C Hamano  wrote:
> Lars Schneider  writes:
> 
> If the filter process refuses to die forever when Git told it to
> shutdown (by closing the pipe to it, for example), that filter
> process is simply buggy.  I think we want users to become aware of
> that, instead of Git leaving it behind, which essentially is to
> sweep the problem under the rug.
>>> 
>>> Well, it would be good to tell users _why_ Git is hanging, see below.
>> 
>> Agreed. Do you think it is OK to write the message to stderr?
> 
> On the other hand, this is why GIT_TRACE (and GIT_TRACE_PERFORMANCE)
> was invented for.  We do not signal troubles with single-shot filters,
> so I guess doing it for multi-file filters is not needed.

I am on the fence with this one.

@Junio/Peff:
Where would you prefer to see a "Waiting for filter 'XYZ'... " message?
On stderr or via GIT_TRACE?


> 
> I agree with what Peff said elsewhere in the thread; if a filter
> process wants to take time to clean things up while letting Git
> proceed, it can do its own process management, but I think it is
> sensible for Git to wait the filter process it directly spawned.
 
 To realize the approach above I prototyped the run-command patch below:
 
 I added an "exit_timeout" variable to the "child_process" struct.
 On exit, Git will close the pipe to the process and wait "exit_timeout" 
 seconds until it kills the child process. If "exit_timeout" is negative
 then Git will wait until the process is done.
>>> 
>>> That might be good approach.  Probably the default would be to wait.
>> 
>> I think I would prefer a 2sec timeout or something as default. This way
>> we can ensure Git would not wait indefinitely for a buggy filter by default.
> 
> Actually this waiting for multi-file filter is only about waiting for
> the shutdown process of the filter.  The filter could still hang during
> processing a file, and git would hang too, if I understand it correctly.

Correct.


>> [...] this function is also used with the async struct... 
> 
> Hmmm... now I wonder if it is a good idea (similar treatment for
> single-file async-invoked filter, and multi-file pkt-line filters).
> 
> For single-file one-shot filter (correct me if I am wrong):
> 
> - git sends contents to filter, signals end with EOF
>   (after process is started)
> - in an async process:
>   - process is started
>   - git reads contents from filter, until EOF
>   - if process did not end, it is killed
> 
> 
> For multi-process pkt-line based filter (simplified):
> 
> - process is started
> - handshake
> - for each file
>   - file is send to filter process over pkt-line,
> end signalled with flush packet
>   - git reads from filter from pkt-line, until flush
> - ...
> 
> 
> See how single-shot filter is sent EOF, though in different part
> of code.  We need to signal multi-file filter that no more files
> will be coming.  Simplest solution is to send EOF (we could send
> "command=shutdown" for example...) to filter, and wait for EOF
> from filter (or for "status=finished" and EOF).

That's what we do. EOF does signal the multi-filter to shutdown.


> For full patch, you would need also to add to Documentation/config.txt

Why config.txt?


Thanks,
Lars

Re: [PATCH v8 00/11] Git filter protocol

2016-10-04 Thread Jakub Narębski
W dniu 03.10.2016 o 19:13, Lars Schneider pisze: 
>> On 01 Oct 2016, at 22:48, Jakub Narębski  wrote:
>> W dniu 01.10.2016 o 20:59, Lars Schneider pisze: 
>>> On 29 Sep 2016, at 23:27, Junio C Hamano  wrote:
 Lars Schneider  writes:

 If the filter process refuses to die forever when Git told it to
 shutdown (by closing the pipe to it, for example), that filter
 process is simply buggy.  I think we want users to become aware of
 that, instead of Git leaving it behind, which essentially is to
 sweep the problem under the rug.
>>
>> Well, it would be good to tell users _why_ Git is hanging, see below.
> 
> Agreed. Do you think it is OK to write the message to stderr?

On the other hand, this is why GIT_TRACE (and GIT_TRACE_PERFORMANCE)
was invented for.  We do not signal troubles with single-shot filters,
so I guess doing it for multi-file filters is not needed.
 
 I agree with what Peff said elsewhere in the thread; if a filter
 process wants to take time to clean things up while letting Git
 proceed, it can do its own process management, but I think it is
 sensible for Git to wait the filter process it directly spawned.
>>>
>>> To realize the approach above I prototyped the run-command patch below:
>>>
>>> I added an "exit_timeout" variable to the "child_process" struct.
>>> On exit, Git will close the pipe to the process and wait "exit_timeout" 
>>> seconds until it kills the child process. If "exit_timeout" is negative
>>> then Git will wait until the process is done.
>>
>> That might be good approach.  Probably the default would be to wait.
> 
> I think I would prefer a 2sec timeout or something as default. This way
> we can ensure Git would not wait indefinitely for a buggy filter by default.

Actually this waiting for multi-file filter is only about waiting for
the shutdown process of the filter.  The filter could still hang during
processing a file, and git would hang too, if I understand it correctly.

[...]
>> Also, how would one set default value of timeout for all process
>> based filters?
> 
> I think we don't need that because a timeout is always specific
> to a filter (if the 2sec default is not sufficient).

All right (assuming that timeouts are good idea). 

>>>
>>> +   while ((waitpid(p->pid, , 0)) < 0 && errno == 
>>> EINTR)
>>> +   ;   /* nothing */
>>
>> Ah, this loop is here because waiting on waitpid() can be interrupted
>> by the delivery of a signal to the calling process; though the result
>> is -1, not just any < 0.
> 
> "< 0" is also used in wait_or_whine()

O.K. (though it doesn't necessary mean that it is correct, there
is another point for using "< 0").
 
[...]
>> There is also another complication: there can be more than one
>> long-running filter driver used.  With this implementation we
>> wait for each of one in sequence, e.g. 10s + 10s + 10s.
> 
> Good idea, I fixed that in the version below!
> 
[...]
> [...] this function is also used with the async struct... 

Hmmm... now I wonder if it is a good idea (similar treatment for
single-file async-invoked filter, and multi-file pkt-line filters).

For single-file one-shot filter (correct me if I am wrong):

 - git sends contents to filter, signals end with EOF
   (after process is started)
 - in an async process:
   - process is started
   - git reads contents from filter, until EOF
   - if process did not end, it is killed


For multi-process pkt-line based filter (simplified):

 - process is started
 - handshake
 - for each file
   - file is send to filter process over pkt-line,
 end signalled with flush packet
   - git reads from filter from pkt-line, until flush
 - ...


See how single-shot filter is sent EOF, though in different part
of code.  We need to signal multi-file filter that no more files
will be coming.  Simplest solution is to send EOF (we could send
"command=shutdown" for example...) to filter, and wait for EOF
from filter (or for "status=finished" and EOF).

We could kill multi-file filter after sending last file and
receiving full response... but I think single-shot filter gets
killed only because it allows for very simple filters, and reusing
existing commands as filters.

[...]
> diff --git a/run-command.c b/run-command.c
> index 3269362..ca0feef 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -21,6 +21,9 @@ void child_process_clear(struct child_process *child)
>  
>  struct child_to_clean {
>   pid_t pid;
> + char *name;

I guess it is here for output purposes?

Should we store full command here, or just name of ?

> + int stdin;

I guess the name `stdin` for file _descriptor_ is something
used in other parts of convert.c code, isn't it?

> + int timeout;

Hmmm... we assume that timeout is in seconds, not millis or other
value, isn't it.  timeout_sec would perhaps be unnecessarily long.

>   struct child_to_clean *next;
>  

Re: [PATCH v8 00/11] Git filter protocol

2016-10-04 Thread Junio C Hamano
Jeff King  writes:

> On Mon, Oct 03, 2016 at 10:02:14AM -0700, Junio C Hamano wrote:
>
>> The timeout would be good for you to give a message "filter process
>> running the script '%s' is not exiting; I am waiting for it".  The
>> user is still left with a hung Git, and can then see if that process
>> is hanging around.  If it is, then we found a buggy filter.  Or we
>> found a buggy Git.  Either needs to be fixed.  I do not think it
>> would help anybody by doing a kill(2) to sweep possible bugs under
>> the rug.
>
> I would argue that we should not even bother with such a timeout. This
> is an exceptional, buggy condition, and hanging is not at all restricted
> to this particular case. If git is hanging, then the right tools are
> "ps" or "strace" to figure out what is going on. I know that not all
> users are comfortable with those tools, but enough are in practice that
> the bugs get ironed out, without git having to carry a bunch of extra
> timing code that is essentially never exercised.

OK.


Re: [PATCH v8 00/11] Git filter protocol

2016-10-04 Thread Jeff King
On Mon, Oct 03, 2016 at 10:02:14AM -0700, Junio C Hamano wrote:

> The timeout would be good for you to give a message "filter process
> running the script '%s' is not exiting; I am waiting for it".  The
> user is still left with a hung Git, and can then see if that process
> is hanging around.  If it is, then we found a buggy filter.  Or we
> found a buggy Git.  Either needs to be fixed.  I do not think it
> would help anybody by doing a kill(2) to sweep possible bugs under
> the rug.

I would argue that we should not even bother with such a timeout. This
is an exceptional, buggy condition, and hanging is not at all restricted
to this particular case. If git is hanging, then the right tools are
"ps" or "strace" to figure out what is going on. I know that not all
users are comfortable with those tools, but enough are in practice that
the bugs get ironed out, without git having to carry a bunch of extra
timing code that is essentially never exercised.

-Peff


Re: [PATCH v8 00/11] Git filter protocol

2016-10-03 Thread Lars Schneider

> On 03 Oct 2016, at 19:02, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>>> If the filter process refuses to die forever when Git told it to
>>> shutdown (by closing the pipe to it, for example), that filter
>>> process is simply buggy.  I think we want users to become aware of
>>> that, instead of Git leaving it behind, which essentially is to
>>> sweep the problem under the rug.
>>> 
>>> I agree with what Peff said elsewhere in the thread; if a filter
>>> process wants to take time to clean things up while letting Git
>>> proceed, it can do its own process management, but I think it is
>>> sensible for Git to wait the filter process it directly spawned.
>> 
>> To realize the approach above I prototyped the run-command patch below:
>> 
>> I added an "exit_timeout" variable to the "child_process" struct.
>> On exit, Git will close the pipe to the process and wait "exit_timeout" 
>> seconds until it kills the child process. If "exit_timeout" is negative
>> then Git will wait until the process is done.
> 
>> If we use that in the long running filter process, then we could make
>> the timeout even configurable. E.g. with "filter..process-timeout".
>> 
>> What do you think about this solution? 
> 
> Is such a configuration (or timeout in general) necessary?  I
> suspect that a need for timeout, especially needing timeout and
> needing to get killed that happens so often to require a
> configuration variable, is a sign of something else seriously wrong.
> 
> What's the justification for a filter to _require_ getting killed
> all the time when it is spawned?  Otherwise you wouldn't configure
> "this driver does not die when told, so we need a timeout" variable.
> Is it a sign of the flaw in the protocol to talk to it?  e.g. Git
> has a way to tell it to die, but it somehow is very hard to hear
> from filter's end and honor that request?
> 
> I think that we would need some timeout in the mechanism, but not to
> be used for "killing".
> 
> You would decide to "kill" an filter process in two cases: the
> filter is buggy and refuses to die when Git tells it to exit, or the
> code in Git waiting for its death is somehow miscounting its
> children, and thought it told to die one process but in fact it
> didn't (perhaps it told somebody else to die), or it thought it
> hasn't seen the child die when in fact it already did.

Agreed.


> Calling kill(2) and exiting would hide these two kind of bugs from
> end users.  Not doing so would give the end users a hung Git, which
> is a VERY GOOD thing.  Otherwise you would not notice bugs and lose
> the opportunity to diagnose and fix it.

Aha. I assumed that a hung Git because of a buggy filter would be a no-no.
Thanks for this clarification.


> The timeout would be good for you to give a message "filter process
> running the script '%s' is not exiting; I am waiting for it".  The
> user is still left with a hung Git, and can then see if that process
> is hanging around.  If it is, then we found a buggy filter.  Or we
> found a buggy Git.  Either needs to be fixed.  I do not think it
> would help anybody by doing a kill(2) to sweep possible bugs under
> the rug.

I could achieve that with this run-command patch: 
http://public-inbox.org/git/e9946e9f-6ee5-492b-b122-9078ceb88...@gmail.com/
(I'll remove the "timeout after x seconds" parts and keep the "wait until 
done" part with stderr output)


Thanks,
Lars

Re: [PATCH v8 00/11] Git filter protocol

2016-10-03 Thread Lars Schneider

> On 01 Oct 2016, at 22:48, Jakub Narębski  wrote:
> 
> W dniu 01.10.2016 o 20:59, Lars Schneider pisze: 
>> On 29 Sep 2016, at 23:27, Junio C Hamano  wrote:
>>> Lars Schneider  writes:
>>> 
>>> If the filter process refuses to die forever when Git told it to
>>> shutdown (by closing the pipe to it, for example), that filter
>>> process is simply buggy.  I think we want users to become aware of
>>> that, instead of Git leaving it behind, which essentially is to
>>> sweep the problem under the rug.
> 
> Well, it would be good to tell users _why_ Git is hanging, see below.

Agreed. Do you think it is OK to write the message to stderr?


>>> I agree with what Peff said elsewhere in the thread; if a filter
>>> process wants to take time to clean things up while letting Git
>>> proceed, it can do its own process management, but I think it is
>>> sensible for Git to wait the filter process it directly spawned.
>> 
>> To realize the approach above I prototyped the run-command patch below:
>> 
>> I added an "exit_timeout" variable to the "child_process" struct.
>> On exit, Git will close the pipe to the process and wait "exit_timeout" 
>> seconds until it kills the child process. If "exit_timeout" is negative
>> then Git will wait until the process is done.
> 
> That might be good approach.  Probably the default would be to wait.

I think I would prefer a 2sec timeout or something as default. This way
we can ensure Git would not wait indefinitely for a buggy filter by default.


>> If we use that in the long running filter process, then we could make
>> the timeout even configurable. E.g. with "filter..process-timeout".
> 
> Sidenote: we prefer camelCase rather than kebab-case for config
> variables, that is, "filter..processTimeout".

Sure!


> Also, how would one set default value of timeout for all process
> based filters?

I think we don't need that because a timeout is always specific
to a filter (if the 2sec default is not sufficient).


>> 
>> +while ((waitpid(p->pid, , 0)) < 0 && errno == 
>> EINTR)
>> +;   /* nothing */
> 
> Ah, this loop is here because waiting on waitpid() can be interrupted
> by the delivery of a signal to the calling process; though the result
> is -1, not just any < 0.

"< 0" is also used in wait_or_whine()


>> +while (getpgid(p->pid) >= 0 && tv.tv_sec - secs < 
>> p->timeout) {
>> +gettimeofday(, NULL);
>> +}
> 
> WTF?  Busy wait loop???

This was just a quick prototype to show "my thinking direction". 
I wasn't expecting a review. Sorry :-)


> Also, if we want to wait for child without blocking, then instead
> of cryptic getpgid(p->pid) maybe use waitpid(p->pid, , WNOHANG);
> it is more explicit.

Sure!


> There is also another complication: there can be more than one
> long-running filter driver used.  With this implementation we
> wait for each of one in sequence, e.g. 10s + 10s + 10s.

Good idea, I fixed that in the version below!


>> 
>> -static void mark_child_for_cleanup(pid_t pid)
>> +static void mark_child_for_cleanup(pid_t pid, int timeout, int stdin)
> 
> H... three parameters is not too much, but we might want to
> pass "struct child_process *" directly if this number grows.

I used parameters because this function is also used with the async struct... 

I've attached a more serious patch for review below.
What do you think?

Thanks,
Lars



diff --git a/run-command.c b/run-command.c
index 3269362..ca0feef 100644
--- a/run-command.c
+++ b/run-command.c
@@ -21,6 +21,9 @@ void child_process_clear(struct child_process *child)
 
 struct child_to_clean {
pid_t pid;
+   char *name;
+   int stdin;
+   int timeout;
struct child_to_clean *next;
 };
 static struct child_to_clean *children_to_clean;
@@ -28,12 +31,53 @@ static int installed_child_cleanup_handler;
 
 static void cleanup_children(int sig, int in_signal)
 {
+   int status;
+   struct timeval tv;
+   time_t secs;
+   struct child_to_clean *p = children_to_clean;
+
+   // Send EOF to children as indicator that Git will exit soon
+   while (p) {
+   if (p->timeout != 0) {
+   if (p->stdin > 0)
+   close(p->stdin);
+   }
+   p = p->next;
+   }
+
while (children_to_clean) {
-   struct child_to_clean *p = children_to_clean;
+   p = children_to_clean;
children_to_clean = p->next;
+
+   if (p->timeout != 0) {
+   fprintf(stderr, _("Waiting for '%s' to finish..."), 
p->name);
+   if (p->timeout < 0) {
+   // No timeout given - wait indefinitely
+   while ((waitpid(p->pid, , 0)) < 0 && 
errno == EINTR)
+   ;   

Re: [PATCH v8 00/11] Git filter protocol

2016-10-03 Thread Junio C Hamano
Lars Schneider  writes:

>> If the filter process refuses to die forever when Git told it to
>> shutdown (by closing the pipe to it, for example), that filter
>> process is simply buggy.  I think we want users to become aware of
>> that, instead of Git leaving it behind, which essentially is to
>> sweep the problem under the rug.
>> 
>> I agree with what Peff said elsewhere in the thread; if a filter
>> process wants to take time to clean things up while letting Git
>> proceed, it can do its own process management, but I think it is
>> sensible for Git to wait the filter process it directly spawned.
>
> To realize the approach above I prototyped the run-command patch below:
>
> I added an "exit_timeout" variable to the "child_process" struct.
> On exit, Git will close the pipe to the process and wait "exit_timeout" 
> seconds until it kills the child process. If "exit_timeout" is negative
> then Git will wait until the process is done.

> If we use that in the long running filter process, then we could make
> the timeout even configurable. E.g. with "filter..process-timeout".
>
> What do you think about this solution? 

Is such a configuration (or timeout in general) necessary?  I
suspect that a need for timeout, especially needing timeout and
needing to get killed that happens so often to require a
configuration variable, is a sign of something else seriously wrong.

What's the justification for a filter to _require_ getting killed
all the time when it is spawned?  Otherwise you wouldn't configure
"this driver does not die when told, so we need a timeout" variable.
Is it a sign of the flaw in the protocol to talk to it?  e.g. Git
has a way to tell it to die, but it somehow is very hard to hear
from filter's end and honor that request?

I think that we would need some timeout in the mechanism, but not to
be used for "killing".

You would decide to "kill" an filter process in two cases: the
filter is buggy and refuses to die when Git tells it to exit, or the
code in Git waiting for its death is somehow miscounting its
children, and thought it told to die one process but in fact it
didn't (perhaps it told somebody else to die), or it thought it
hasn't seen the child die when in fact it already did.

Calling kill(2) and exiting would hide these two kind of bugs from
end users.  Not doing so would give the end users a hung Git, which
is a VERY GOOD thing.  Otherwise you would not notice bugs and lose
the opportunity to diagnose and fix it.

The timeout would be good for you to give a message "filter process
running the script '%s' is not exiting; I am waiting for it".  The
user is still left with a hung Git, and can then see if that process
is hanging around.  If it is, then we found a buggy filter.  Or we
found a buggy Git.  Either needs to be fixed.  I do not think it
would help anybody by doing a kill(2) to sweep possible bugs under
the rug.

Thanks.


Re: [PATCH v8 00/11] Git filter protocol

2016-10-01 Thread Jakub Narębski
W dniu 01.10.2016 o 20:59, Lars Schneider pisze: 
> On 29 Sep 2016, at 23:27, Junio C Hamano  wrote:
>> Lars Schneider  writes:
>>
>>> We discussed that issue in v4 and v6:
>>> http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
>>> http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/
>>>
>>> My impression was that you don't want Git to wait for the filter process.
>>> If Git waits for the filter process - how long should Git wait?
>>
>> I am not sure where you got that impression.  I did say that I do
>> not want Git to _KILL_ my filter process.  That does not mean I want
>> Git to go away without waiting for me.
>>
>> If the filter process refuses to die forever when Git told it to
>> shutdown (by closing the pipe to it, for example), that filter
>> process is simply buggy.  I think we want users to become aware of
>> that, instead of Git leaving it behind, which essentially is to
>> sweep the problem under the rug.

Well, it would be good to tell users _why_ Git is hanging, see below.

>>
>> I agree with what Peff said elsewhere in the thread; if a filter
>> process wants to take time to clean things up while letting Git
>> proceed, it can do its own process management, but I think it is
>> sensible for Git to wait the filter process it directly spawned.
> 
> To realize the approach above I prototyped the run-command patch below:
> 
> I added an "exit_timeout" variable to the "child_process" struct.
> On exit, Git will close the pipe to the process and wait "exit_timeout" 
> seconds until it kills the child process. If "exit_timeout" is negative
> then Git will wait until the process is done.

That might be good approach.  Probably the default would be to wait.

> 
> If we use that in the long running filter process, then we could make
> the timeout even configurable. E.g. with "filter..process-timeout".

Sidenote: we prefer camelCase rather than kebab-case for config
variables, that is, "filter..processTimeout".

Also, how would one set default value of timeout for all process
based filters?

> 
> What do you think about this solution?

I think this addition be done after, assuming that we come up
with good default behavior (e.g. wait for filter processes
to finish).

Also, we would probably want to add some progress information
in a similar way to progress info for checkout, that is display
it after a few seconds waiting.

This could be, for example:

  Waiting for filter '' to finish... done

With timeout it could look like this (where underlined part
is interactive, that is changes every second):

  Waiting 10s for '' filter process to finish.
  ^^^

And then either

  Filter '' killed

or

  Filter '' finished


> diff --git a/run-command.c b/run-command.c
> index 3269362..a933066 100644
> --- a/run-command.c
> +++ b/run-command.c
> @@ -21,6 +21,8 @@ void child_process_clear(struct child_process *child)
>  
>  struct child_to_clean {
>   pid_t pid;
> + int stdin;
> + int timeout;
>   struct child_to_clean *next;
>  };
>  static struct child_to_clean *children_to_clean;
> @@ -28,9 +30,30 @@ static int installed_child_cleanup_handler;
>  
>  static void cleanup_children(int sig, int in_signal)
>  {
> + int status;
> + struct timeval tv;
> + time_t secs;
> +
>   while (children_to_clean) {
>   struct child_to_clean *p = children_to_clean;
>   children_to_clean = p->next;
> +
> + if (p->timeout != 0 && p->stdin > 0)
> + close(p->stdin);

Why are you not closing stdin of filter driver process if timeout
is zero?  Is it maybe some kind of special value?  If it is, this
is not documented.

> +
> + if (p->timeout < 0) {
> + // Wait until the process finishes
> + while ((waitpid(p->pid, , 0)) < 0 && errno == 
> EINTR)
> + ;   /* nothing */

Ah, this loop is here because waiting on waitpid() can be interrupted
by the delivery of a signal to the calling process; though the result
is -1, not just any < 0.

Looks good (but we would want some progress information, probably).

> + } else if (p->timeout != 0) {
> + // Wait until the process finishes or timeout
> + gettimeofday(, NULL);
> + secs = tv.tv_sec;
> + while (getpgid(p->pid) >= 0 && tv.tv_sec - secs < 
> p->timeout) {
> + gettimeofday(, NULL);
> + }

WTF?  Busy wait loop???

Also, if we want to wait for child without blocking, then instead
of cryptic getpgid(p->pid) maybe use waitpid(p->pid, , WNOHANG);
it is more explicit.

There is also another complication: there can be more than one
long-running filter driver used.  With this implementation we
wait for each of one in sequence, e.g. 10s + 10s + 10s.

> + }
> +
>   

Re: [PATCH v8 00/11] Git filter protocol

2016-10-01 Thread Lars Schneider

> On 29 Sep 2016, at 23:27, Junio C Hamano  wrote:
> 
> Lars Schneider  writes:
> 
>> We discussed that issue in v4 and v6:
>> http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
>> http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/
>> 
>> My impression was that you don't want Git to wait for the filter process.
>> If Git waits for the filter process - how long should Git wait?
> 
> I am not sure where you got that impression.  I did say that I do
> not want Git to _KILL_ my filter process.  That does not mean I want
> Git to go away without waiting for me.
> 
> If the filter process refuses to die forever when Git told it to
> shutdown (by closing the pipe to it, for example), that filter
> process is simply buggy.  I think we want users to become aware of
> that, instead of Git leaving it behind, which essentially is to
> sweep the problem under the rug.
> 
> I agree with what Peff said elsewhere in the thread; if a filter
> process wants to take time to clean things up while letting Git
> proceed, it can do its own process management, but I think it is
> sensible for Git to wait the filter process it directly spawned.

To realize the approach above I prototyped the run-command patch below:

I added an "exit_timeout" variable to the "child_process" struct.
On exit, Git will close the pipe to the process and wait "exit_timeout" 
seconds until it kills the child process. If "exit_timeout" is negative
then Git will wait until the process is done.

If we use that in the long running filter process, then we could make
the timeout even configurable. E.g. with "filter..process-timeout".

What do you think about this solution? 

Thanks,
Lars



diff --git a/run-command.c b/run-command.c
index 3269362..a933066 100644
--- a/run-command.c
+++ b/run-command.c
@@ -21,6 +21,8 @@ void child_process_clear(struct child_process *child)
 
 struct child_to_clean {
pid_t pid;
+   int stdin;
+   int timeout;
struct child_to_clean *next;
 };
 static struct child_to_clean *children_to_clean;
@@ -28,9 +30,30 @@ static int installed_child_cleanup_handler;
 
 static void cleanup_children(int sig, int in_signal)
 {
+   int status;
+   struct timeval tv;
+   time_t secs;
+
while (children_to_clean) {
struct child_to_clean *p = children_to_clean;
children_to_clean = p->next;
+
+   if (p->timeout != 0 && p->stdin > 0)
+   close(p->stdin);
+
+   if (p->timeout < 0) {
+   // Wait until the process finishes
+   while ((waitpid(p->pid, , 0)) < 0 && errno == 
EINTR)
+   ;   /* nothing */
+   } else if (p->timeout != 0) {
+   // Wait until the process finishes or timeout
+   gettimeofday(, NULL);
+   secs = tv.tv_sec;
+   while (getpgid(p->pid) >= 0 && tv.tv_sec - secs < 
p->timeout) {
+   gettimeofday(, NULL);
+   }
+   }
+
kill(p->pid, sig);
if (!in_signal)
free(p);
@@ -49,10 +72,12 @@ static void cleanup_children_on_exit(void)
cleanup_children(SIGTERM, 0);
 }
 
-static void mark_child_for_cleanup(pid_t pid)
+static void mark_child_for_cleanup(pid_t pid, int timeout, int stdin)
 {
struct child_to_clean *p = xmalloc(sizeof(*p));
p->pid = pid;
+   p->stdin = stdin;
+   p->timeout = timeout;
p->next = children_to_clean;
children_to_clean = p;
 
@@ -422,7 +447,7 @@ int start_command(struct child_process *cmd)
if (cmd->pid < 0)
error_errno("cannot fork() for %s", cmd->argv[0]);
else if (cmd->clean_on_exit)
-   mark_child_for_cleanup(cmd->pid);
+   mark_child_for_cleanup(cmd->pid, cmd->exit_timeout, cmd->in);
 
/*
 * Wait for child's execvp. If the execvp succeeds (or if fork()
@@ -483,7 +508,7 @@ int start_command(struct child_process *cmd)
if (cmd->pid < 0 && (!cmd->silent_exec_failure || errno != ENOENT))
error_errno("cannot spawn %s", cmd->argv[0]);
if (cmd->clean_on_exit && cmd->pid >= 0)
-   mark_child_for_cleanup(cmd->pid);
+   mark_child_for_cleanup(cmd->pid, cmd->exit_timeout, cmd->in);
 
argv_array_clear();
cmd->argv = sargv;
@@ -765,7 +790,7 @@ int start_async(struct async *async)
exit(!!async->proc(proc_in, proc_out, async->data));
}
 
-   mark_child_for_cleanup(async->pid);
+   mark_child_for_cleanup(async->pid, 0, -1);
 
if (need_in)
close(fdin[0]);
diff --git a/run-command.h b/run-command.h
index cf29a31..f2eca33 100644
--- a/run-command.h
+++ b/run-command.h
@@ -33,6 +33,7 @@ struct 

Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Lars Schneider  writes:

> We discussed that issue in v4 and v6:
> http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
> http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/
>
> My impression was that you don't want Git to wait for the filter process.
> If Git waits for the filter process - how long should Git wait?

I am not sure where you got that impression.  I did say that I do
not want Git to _KILL_ my filter process.  That does not mean I want
Git to go away without waiting for me.

If the filter process refuses to die forever when Git told it to
shutdown (by closing the pipe to it, for example), that filter
process is simply buggy.  I think we want users to become aware of
that, instead of Git leaving it behind, which essentially is to
sweep the problem under the rug.

I agree with what Peff said elsewhere in the thread; if a filter
process wants to take time to clean things up while letting Git
proceed, it can do its own process management, but I think it is
sensible for Git to wait the filter process it directly spawned.



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

> I don't necessarily agree, though, that the timing of filter-process
> cleanup needs to be part of the public interface. So in your list:
>
>> 3) Git waits until the filter process finishes.
>
> That seems simple and elegant, but I can think of reasons we might not
> want to wait (e.g., if the filter has to do some maintenance task and
> does not the user to have to wait).
>
> OTOH, we already face this in git, and we solve it by explicitly
> backgrounding the maintenance task (i.e., auto-gc). So one could argue
> that it is the responsibility of the filter process to manage its own
> processes. It certainly makes the interaction with git simpler.

Yup, that summarizes my thinking a lot better than I managed to do
in the previous message.



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Jakub Narębski  writes:

> Or even better: make filter driver write its pid to pidfile, and then
> "wait $(cat rot13-filter.pid)".  That's what we do in lib-git-daemon.sh
> (I think).

I am not sure if "wait"ing on a random process that is not a direct
child is a reasonable thing to do, but I like the direction.

Communicate with a pidfile and wait until "kill -0 $that_pid" fails,
or something like that, would be clean enough.





Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Lars Schneider  writes:

> A pragmatic approach:
>
> I could drop the "STOP" message that the filter writes to the log
> on exit and everything would work as is. We could argue that this 
> is OK because Git doesn't care anyways if the filter process has 
> stopped or not.

That would mean you can leave the process running while the test
framework tries to remove the trash directory when we are done,
creating the same bug J6t mentioned in the thread, no?



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Jakub Narębski
W dniu 29.09.2016 o 13:57, Torsten Bögershausen pisze: 
> On 29/09/16 12:28, Lars Schneider wrote:

>> This is what happens:
>>
>> 1) Git exits
>> 2) The filter process receives EOF and prints "STOP" to the log
>> 3) t0021 checks the content of the log
>>
>> Sometimes 3 happened before 2 which makes the test fail.
>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>
>> I added a this to wait until the filter process terminates:
>>
>> +wait_for_filter_termination () {
>> +while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>> 2>&1
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
>>
>> Does this look OK to you?
> Do we need the ps at all ?
> How about this:
> 
> +wait_for_filter_termination () {
> +while ! grep "STOP"  LOGFILENAME >/dev/null
> +do
> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
> +sleep 1
> +done
> +}

Or even better: make filter driver write its pid to pidfile, and then
"wait $(cat rot13-filter.pid)".  That's what we do in lib-git-daemon.sh
(I think).

If the problem is exit status of "wait" builtin, then filter driver
can remove its pidfile after writing "STOP", just before ending.

-- 
Jakub Narębski



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Lars Schneider

> On 29 Sep 2016, at 18:57, Junio C Hamano  wrote:
> 
> Torsten Bögershausen  writes:
> 
>>> 1) Git exits
>>> 2) The filter process receives EOF and prints "STOP" to the log
>>> 3) t0021 checks the content of the log
>>> 
>>> Sometimes 3 happened before 2 which makes the test fail.
>>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>> 
>>> I added a this to wait until the filter process terminates:
>>> 
>>> +wait_for_filter_termination () {
>>> +   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>>> 2>&1
>>> +   do
>>> +   echo "Waiting for /t0021/rot13-filter.pl to finish..."
>>> +   sleep 1
>>> +   done
>>> +}
>>> 
>>> Does this look OK to you?
>> Do we need the ps at all ?
>> How about this:
>> 
>> +wait_for_filter_termination () {
>> +while ! grep "STOP"  LOGFILENAME >/dev/null
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
> 
> Running "ps" and grepping for a command is not suitable for script
> to reliably tell things, so it is out of question.  Compared to
> that, your version looks slightly better, but what if the machinery
> that being tested, i.e. the part that drives the filter process, is
> buggy or becomes buggy and causes the filter process that writes
> "STOP" to die before it actually writes that string?
> 
> I have a feeling that the machinery being tested needs to be fixed
> so that the sequence is always be:
> 
>0) Git spawns the filter process, as it needs some contents to
>   be filtered.
> 
>1) Git did everything it needed to do and decides that is time
>   to go.
> 
>2) Filter process receives EOF and prints "STOP" to the log.
> 
>3) Git waits until the filter process finishes.
> 
>4) t0021, after Git finishes, checks the log.


A pragmatic approach:

I could drop the "STOP" message that the filter writes to the log
on exit and everything would work as is. We could argue that this 
is OK because Git doesn't care anyways if the filter process has 
stopped or not.

Would that be OK for everyone?

- Lars

Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Johannes Sixt

Am 29.09.2016 um 20:18 schrieb Torsten Bögershausen:

I would agree that  Git should not wait for the filter.
But does the test suite need to wait for the filter ?


We have fixed a test case on Windows recently where a process hung 
around too long (5babb5bd). So, yes, the test suite has to wait for the 
filter.


-- Hannes



Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Torsten Bögershausen



On 29/09/16 19:57, Lars Schneider wrote:

On 29 Sep 2016, at 18:57, Junio C Hamano  wrote:

Torsten Bögershausen  writes:


1) Git exits
2) The filter process receives EOF and prints "STOP" to the log
3) t0021 checks the content of the log

Sometimes 3 happened before 2 which makes the test fail.
(Example: https://travis-ci.org/git/git/jobs/162660563 )

I added a this to wait until the filter process terminates:

+wait_for_filter_termination () {
+   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
2>&1
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Does this look OK to you?

Do we need the ps at all ?
How about this:

+wait_for_filter_termination () {
+   while ! grep "STOP"  LOGFILENAME >/dev/null
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Running "ps" and grepping for a command is not suitable for script
to reliably tell things, so it is out of question.  Compared to
that, your version looks slightly better, but what if the machinery
that being tested, i.e. the part that drives the filter process, is
buggy or becomes buggy and causes the filter process that writes
"STOP" to die before it actually writes that string?

I have a feeling that the machinery being tested needs to be fixed
so that the sequence is always be:

0) Git spawns the filter process, as it needs some contents to
   be filtered.

1) Git did everything it needed to do and decides that is time
   to go.

2) Filter process receives EOF and prints "STOP" to the log.

3) Git waits until the filter process finishes.

4) t0021, after Git finishes, checks the log.

Repeated sleep combined with grep is probably just sweeping the real
problem under the rug.  Do we have enough information to do the
above?

An inspiration may be in the way we centrally clean all tempfiles
and lockfiles before exiting.  We have a central registry of these
files that need cleaning up and have a single atexit(3) handler to
clean them up.  Perhaps we need a registry that filter processes
spawned by the mechanism Lars introduces in this series, and have an
atexit(3) handler that closes the pipe to them (which signals the
filters that it is time for them to go) and wait(2) on them, or
something?  I do not think we want any kill(2) to be involved in
this clean-up procedure, but I do think we should wait(2) on what we
spawn, as long as these processes are meant to be shut down when the
main process of Git exits (this is different from things like
credential-cache daemon where they are expected to persist and meant
to serve multiple Git processes).

We discussed that issue in v4 and v6:
http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/

My impression was that you don't want Git to wait for the filter process.
If Git waits for the filter process - how long should Git wait?

Thanks,
Lars


Hm,
I would agree that  Git should not wait for the filter.
But does the test suite need to wait for the filter ?
May be, in this case we test the filter and Git, which is good.
Adding a 1 second delay, if, and only if, there is a racy condition,
is not that bad (or do we have better ways to check for a process to
be terminated ?)




Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 09:57:57AM -0700, Junio C Hamano wrote:

> > +wait_for_filter_termination () {
> > +   while ! grep "STOP"  LOGFILENAME >/dev/null
> > +   do
> > +   echo "Waiting for /t0021/rot13-filter.pl to finish..."
> > +   sleep 1
> > +   done
> > +}
> 
> Running "ps" and grepping for a command is not suitable for script
> to reliably tell things, so it is out of question.  Compared to
> that, your version looks slightly better, but what if the machinery
> that being tested, i.e. the part that drives the filter process, is
> buggy or becomes buggy and causes the filter process that writes
> "STOP" to die before it actually writes that string?

I'm of the opinion that any busy-waiting is a good sign that something
is suboptimal. The right solution here seems like it should be signaling
the test script via a descriptor.

I don't necessarily agree, though, that the timing of filter-process
cleanup needs to be part of the public interface. So in your list:

> 3) Git waits until the filter process finishes.

That seems simple and elegant, but I can think of reasons we might not
want to wait (e.g., if the filter has to do some maintenance task and
does not the user to have to wait).

OTOH, we already face this in git, and we solve it by explicitly
backgrounding the maintenance task (i.e., auto-gc). So one could argue
that it is the responsibility of the filter process to manage its own
processes. It certainly makes the interaction with git simpler.

-Peff


Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Lars Schneider

> On 29 Sep 2016, at 18:57, Junio C Hamano  wrote:
> 
> Torsten Bögershausen  writes:
> 
>>> 1) Git exits
>>> 2) The filter process receives EOF and prints "STOP" to the log
>>> 3) t0021 checks the content of the log
>>> 
>>> Sometimes 3 happened before 2 which makes the test fail.
>>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>> 
>>> I added a this to wait until the filter process terminates:
>>> 
>>> +wait_for_filter_termination () {
>>> +   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>>> 2>&1
>>> +   do
>>> +   echo "Waiting for /t0021/rot13-filter.pl to finish..."
>>> +   sleep 1
>>> +   done
>>> +}
>>> 
>>> Does this look OK to you?
>> Do we need the ps at all ?
>> How about this:
>> 
>> +wait_for_filter_termination () {
>> +while ! grep "STOP"  LOGFILENAME >/dev/null
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
> 
> Running "ps" and grepping for a command is not suitable for script
> to reliably tell things, so it is out of question.  Compared to
> that, your version looks slightly better, but what if the machinery
> that being tested, i.e. the part that drives the filter process, is
> buggy or becomes buggy and causes the filter process that writes
> "STOP" to die before it actually writes that string?
> 
> I have a feeling that the machinery being tested needs to be fixed
> so that the sequence is always be:
> 
>0) Git spawns the filter process, as it needs some contents to
>   be filtered.
> 
>1) Git did everything it needed to do and decides that is time
>   to go.
> 
>2) Filter process receives EOF and prints "STOP" to the log.
> 
>3) Git waits until the filter process finishes.
> 
>4) t0021, after Git finishes, checks the log.
> 
> Repeated sleep combined with grep is probably just sweeping the real
> problem under the rug.  Do we have enough information to do the
> above?
> 
> An inspiration may be in the way we centrally clean all tempfiles
> and lockfiles before exiting.  We have a central registry of these
> files that need cleaning up and have a single atexit(3) handler to
> clean them up.  Perhaps we need a registry that filter processes
> spawned by the mechanism Lars introduces in this series, and have an
> atexit(3) handler that closes the pipe to them (which signals the
> filters that it is time for them to go) and wait(2) on them, or
> something?  I do not think we want any kill(2) to be involved in
> this clean-up procedure, but I do think we should wait(2) on what we
> spawn, as long as these processes are meant to be shut down when the
> main process of Git exits (this is different from things like
> credential-cache daemon where they are expected to persist and meant
> to serve multiple Git processes).

We discussed that issue in v4 and v6:
http://public-inbox.org/git/20160803225313.pk3tfe5ovz4y3...@sigill.intra.peff.net/
http://public-inbox.org/git/xmqqbn0a3wy3@gitster.mtv.corp.google.com/

My impression was that you don't want Git to wait for the filter process.
If Git waits for the filter process - how long should Git wait?

Thanks,
Lars

Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Junio C Hamano
Torsten Bögershausen  writes:

>> 1) Git exits
>> 2) The filter process receives EOF and prints "STOP" to the log
>> 3) t0021 checks the content of the log
>>
>> Sometimes 3 happened before 2 which makes the test fail.
>> (Example: https://travis-ci.org/git/git/jobs/162660563 )
>>
>> I added a this to wait until the filter process terminates:
>>
>> +wait_for_filter_termination () {
>> +while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
>> 2>&1
>> +do
>> +echo "Waiting for /t0021/rot13-filter.pl to finish..."
>> +sleep 1
>> +done
>> +}
>>
>> Does this look OK to you?
> Do we need the ps at all ?
> How about this:
>
> +wait_for_filter_termination () {
> + while ! grep "STOP"  LOGFILENAME >/dev/null
> + do
> + echo "Waiting for /t0021/rot13-filter.pl to finish..."
> + sleep 1
> + done
> +}

Running "ps" and grepping for a command is not suitable for script
to reliably tell things, so it is out of question.  Compared to
that, your version looks slightly better, but what if the machinery
that being tested, i.e. the part that drives the filter process, is
buggy or becomes buggy and causes the filter process that writes
"STOP" to die before it actually writes that string?

I have a feeling that the machinery being tested needs to be fixed
so that the sequence is always be:

0) Git spawns the filter process, as it needs some contents to
   be filtered.

1) Git did everything it needed to do and decides that is time
   to go.

2) Filter process receives EOF and prints "STOP" to the log.

3) Git waits until the filter process finishes.

4) t0021, after Git finishes, checks the log.

Repeated sleep combined with grep is probably just sweeping the real
problem under the rug.  Do we have enough information to do the
above?

An inspiration may be in the way we centrally clean all tempfiles
and lockfiles before exiting.  We have a central registry of these
files that need cleaning up and have a single atexit(3) handler to
clean them up.  Perhaps we need a registry that filter processes
spawned by the mechanism Lars introduces in this series, and have an
atexit(3) handler that closes the pipe to them (which signals the
filters that it is time for them to go) and wait(2) on them, or
something?  I do not think we want any kill(2) to be involved in
this clean-up procedure, but I do think we should wait(2) on what we
spawn, as long as these processes are meant to be shut down when the
main process of Git exits (this is different from things like
credential-cache daemon where they are expected to persist and meant
to serve multiple Git processes).




Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Torsten Bögershausen



On 29/09/16 12:28, Lars Schneider wrote:

On 28 Sep 2016, at 23:49, Junio C Hamano  wrote:

I suspect that you are preparing a reroll already, but the one that
is sitting in 'pu' seems to be flaky in t/t0021 and I seem to see
occasional failures from it.

I didn't trace where the test goes wrong, but one easy mistake you
could make (I am not saying that is the reason of the failure) is to
assume your filter will not be called under certain condition (like
immediately after you checked out from the index to the working
tree), when the automated test goes fast enough and get you into a
"racy git" situation---the filter may be asked to filter the
contents from the working tree again to re-validate what's there is
still what is in the index.

Thanks for the heads-up!

This is what happens:

1) Git exits
2) The filter process receives EOF and prints "STOP" to the log
3) t0021 checks the content of the log

Sometimes 3 happened before 2 which makes the test fail.
(Example: https://travis-ci.org/git/git/jobs/162660563 )

I added a this to wait until the filter process terminates:

+wait_for_filter_termination () {
+   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
2>&1
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Does this look OK to you?

Do we need the ps at all ?
How about this:

+wait_for_filter_termination () {
+   while ! grep "STOP"  LOGFILENAME >/dev/null
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}




Re: [PATCH v8 00/11] Git filter protocol

2016-09-29 Thread Lars Schneider

> On 28 Sep 2016, at 23:49, Junio C Hamano  wrote:
> 
> I suspect that you are preparing a reroll already, but the one that
> is sitting in 'pu' seems to be flaky in t/t0021 and I seem to see
> occasional failures from it.
> 
> I didn't trace where the test goes wrong, but one easy mistake you
> could make (I am not saying that is the reason of the failure) is to
> assume your filter will not be called under certain condition (like
> immediately after you checked out from the index to the working
> tree), when the automated test goes fast enough and get you into a
> "racy git" situation---the filter may be asked to filter the
> contents from the working tree again to re-validate what's there is
> still what is in the index.

Thanks for the heads-up! 

This is what happens:

1) Git exits
2) The filter process receives EOF and prints "STOP" to the log
3) t0021 checks the content of the log

Sometimes 3 happened before 2 which makes the test fail.
(Example: https://travis-ci.org/git/git/jobs/162660563 )

I added a this to wait until the filter process terminates:

+wait_for_filter_termination () {
+   while ps | grep -v grep | grep -F "/t0021/rot13-filter.pl" >/dev/null 
2>&1
+   do
+   echo "Waiting for /t0021/rot13-filter.pl to finish..."
+   sleep 1
+   done
+}

Does this look OK to you?

- Lars


Re: [PATCH v8 00/11] Git filter protocol

2016-09-28 Thread Junio C Hamano
I suspect that you are preparing a reroll already, but the one that
is sitting in 'pu' seems to be flaky in t/t0021 and I seem to see
occasional failures from it.

I didn't trace where the test goes wrong, but one easy mistake you
could make (I am not saying that is the reason of the failure) is to
assume your filter will not be called under certain condition (like
immediately after you checked out from the index to the working
tree), when the automated test goes fast enough and get you into a
"racy git" situation---the filter may be asked to filter the
contents from the working tree again to re-validate what's there is
still what is in the index.