Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Felipe Contreras
On Sat, Jun 1, 2013 at 9:03 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 8:51 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 Nobody is checking for specific error codes; it's the errno that's
 important.

 Have you just disregarded the in-code comment you just removed with
 one statement?

Who cares about the comment? As I said, nobody is checking for those codes.

 Did you check all its callers?

Yes, that's why I said nobody is checking for those codes.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Duy Nguyen
On Sat, Jun 1, 2013 at 9:06 PM, Felipe Contreras
felipe.contre...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 9:03 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 8:51 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 Nobody is checking for specific error codes; it's the errno that's
 important.

 Have you just disregarded the in-code comment you just removed with
 one statement?

 Who cares about the comment? As I said, nobody is checking for those codes.

Apparently I do.

 Did you check all its callers?

 Yes, that's why I said nobody is checking for those codes.

Thanks. If would be a few mails less if you stated so in the original message.
--
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Thomas Rast
Felipe Contreras felipe.contre...@gmail.com writes:

 Nobody is checking for specific error codes; it's the errno that's
 important.
[...]
 - /*
 -  * This return value is chosen so that code  0xff
 -  * mimics the exit code that a POSIX shell would report for
 -  * a program that died from this signal.
 -  */
 - code += 128;

Have you checked the callers?  There are lots of callers of
finish_command(), which returns the value from wait_or_whine()
unmodified.

-- 
Thomas Rast
trast@{inf,student}.ethz.ch
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Felipe Contreras
On Sat, Jun 1, 2013 at 9:08 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 9:06 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 9:03 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 8:51 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 Nobody is checking for specific error codes; it's the errno that's
 important.

 Have you just disregarded the in-code comment you just removed with
 one statement?

 Who cares about the comment? As I said, nobody is checking for those codes.

 Apparently I do.

Why do you care about code comments that have no relation to reality?

 Did you check all its callers?

 Yes, that's why I said nobody is checking for those codes.

 Thanks. If would be a few mails less if you stated so in the original message.

So why did you think I said that that nobody is checking for those codes?

Anyway, apparently somebody added code that checks for the specific
code since I wrote this patch:

1250857 launch_editor: propagate signals from editor to git

To my knowledge this is the only place where the specific number us
actually checked.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Felipe Contreras
On Sat, Jun 1, 2013 at 9:19 AM, Thomas Rast tr...@inf.ethz.ch wrote:
 Felipe Contreras felipe.contre...@gmail.com writes:

 Nobody is checking for specific error codes; it's the errno that's
 important.
 [...]
 - /*
 -  * This return value is chosen so that code  0xff
 -  * mimics the exit code that a POSIX shell would report for
 -  * a program that died from this signal.
 -  */
 - code += 128;

 Have you checked the callers?  There are lots of callers of
 finish_command(), which returns the value from wait_or_whine()
 unmodified.

Yes I did. Most of them simply check that the number is not zero.

However, that was at the time I wrote the patch, and it seems there's
now one instance where the code is checked.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Felipe Contreras
On Sat, Jun 1, 2013 at 9:21 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 8:51 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 Nobody is checking for specific error codes; it's the errno that's
 important.

 Signed-off-by: Felipe Contreras felipe.contre...@gmail.com
 ---
  run-command.c | 14 ++
  1 file changed, 2 insertions(+), 12 deletions(-)

 diff --git a/run-command.c b/run-command.c
 index 1b32a12..e54e943 100644
 --- a/run-command.c
 +++ b/run-command.c
 @@ -244,21 +244,11 @@ static int wait_or_whine(pid_t pid, const char *argv0)
 code = WTERMSIG(status);
 if (code != SIGINT  code != SIGQUIT)
 error(%s died of signal %d, argv0, code);
 -   /*
 -* This return value is chosen so that code  0xff
 -* mimics the exit code that a POSIX shell would report for
 -* a program that died from this signal.
 -*/
 -   code += 128;
 } else if (WIFEXITED(status)) {

 The original commit that introduces this says

 run_command: encode deadly signal number in the return value

 We now write the signal number in the error message if the program
 terminated by a signal. The negative return value is constructed such that
 after truncation to 8 bits it looks like a POSIX shell's $?:

$ echo  | { git upload-pack .; echo $? 2; } | :
error: git-upload-pack died of signal 13
141

 Previously, the exit code was 255 instead of 141.

 So this is part of the interface to the user. With your changes, the
 exit code is now different. I tested by force segfaulting upload-pack.
 $? returned 11. So NAK.

Yeah, and last year we returned a different code. The world didn't
end, because nobody is checking for the specific code. But if you want
to retain complexity forever, suit yourselves.

commit 709ca730f8e093005cc882bfb86c0ca9c83d345b
Author: Jeff King p...@peff.net
Date:   Sat Jan 5 09:49:49 2013 -0500

run-command: encode signal death as a positive integer

When a sub-command dies due to a signal, we encode the
signal number into the numeric exit status as signal -
128. This is easy to identify (versus a regular positive
error code), and when cast to an unsigned integer (e.g., by
feeding it to exit), matches what a POSIX shell would return
when reporting a signal death in $? or through its own exit
code.

So we have a negative value inside the code, but once it
passes across an exit() barrier, it looks positive (and any
code we receive from a sub-shell will have the positive
form). E.g., death by SIGPIPE (signal 13) will look like
-115 to us in inside git, but will end up as 141 when we
call exit() with it. And a program killed by SIGPIPE but run
via the shell will come to us with an exit code of 141.

Unfortunately, this means that when the use_shell option
is set, we need to be on the lookout for _both_ forms. We
might or might not have actually invoked the shell (because
we optimize out some useless shell calls). If we didn't invoke
the shell, we will will see the sub-process's signal death
directly, and run-command converts it into a negative value.
But if we did invoke the shell, we will see the shell's
128+signal exit status. To be thorough, we would need to
check both, or cast the value to an unsigned char (after
checking that it is not -1, which is a magic error value).

Fortunately, most callsites do not care at all whether the
exit was from a code or from a signal; they merely check for
a non-zero status, and sometimes propagate the error via
exit(). But for the callers that do care, we can make life
slightly easier by just using the consistent positive form.

This actually fixes two minor bugs:

  1. In launch_editor, we check whether the editor died from
 SIGINT or SIGQUIT. But we checked only the negative
 form, meaning that we would fail to notice a signal
 death exit code which was propagated through the shell.

  2. In handle_alias, we assume that a negative return value
 from run_command means that errno tells us something
 interesting (like a fork failure, or ENOENT).
 Otherwise, we simply propagate the exit code. Negative
 signal death codes confuse us, and we print a useless
 unable to run alias 'foo': Success message. By
 encoding signal deaths using the positive form, the
 existing code just propagates it as it would a normal
 non-zero exit code.

The downside is that callers of run_command can no longer
differentiate between a signal received directly by the
sub-process, and one propagated. However, no caller
currently cares, and since we already optimize out some
calls to the shell under the hood, that distinction is not
something that 

Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Duy Nguyen
On Sat, Jun 1, 2013 at 9:30 PM, Felipe Contreras
felipe.contre...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 9:21 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 8:51 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 Yeah, and last year we returned a different code. The world didn't
 end, because nobody is checking for the specific code. But if you want
 to retain complexity forever, suit yourselves.

And that was changed for a reason, compared to this change because I
like it. I maintain my NAK (not that it matters) until you justify
your change better than nobody is using it.

 commit 709ca730f8e093005cc882bfb86c0ca9c83d345b
 Author: Jeff King p...@peff.net
 Date:   Sat Jan 5 09:49:49 2013 -0500

 run-command: encode signal death as a positive integer
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Felipe Contreras
On Sat, Jun 1, 2013 at 9:36 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 9:30 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 9:21 AM, Duy Nguyen pclo...@gmail.com wrote:
 On Sat, Jun 1, 2013 at 8:51 PM, Felipe Contreras
 felipe.contre...@gmail.com wrote:
 Yeah, and last year we returned a different code. The world didn't
 end, because nobody is checking for the specific code. But if you want
 to retain complexity forever, suit yourselves.

 And that was changed for a reason, compared to this change because I
 like it. I maintain my NAK (not that it matters) until you justify
 your change better than nobody is using it.

Who said the reason was because I like it? You don't agree that
making the code simpler and more maintainable is a good reason for any
change?

Anyway, if you care so much about the current behavior, why isn't
there any tests that check for this?

My patch passes *all* the tests.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Jeff King
On Sat, Jun 01, 2013 at 09:30:50AM -0500, Felipe Contreras wrote:

  The original commit that introduces this says
 
  run_command: encode deadly signal number in the return value
 
  We now write the signal number in the error message if the program
  terminated by a signal. The negative return value is constructed such 
  that
  after truncation to 8 bits it looks like a POSIX shell's $?:
 
 $ echo  | { git upload-pack .; echo $? 2; } | :
 error: git-upload-pack died of signal 13
 141
 
  Previously, the exit code was 255 instead of 141.
 
  So this is part of the interface to the user. With your changes, the
  exit code is now different. I tested by force segfaulting upload-pack.
  $? returned 11. So NAK.
 
 Yeah, and last year we returned a different code. The world didn't
 end, because nobody is checking for the specific code. But if you want
 to retain complexity forever, suit yourselves.

Last year we returned a different code from the function that other C
code saw. But what got returned via exit() to exterior programs was
always 141 in the SIGPIPE case, both before and after my 709ca730. That
is explained in the first two paragraphs here:

 commit 709ca730f8e093005cc882bfb86c0ca9c83d345b
 Author: Jeff King p...@peff.net
 Date:   Sat Jan 5 09:49:49 2013 -0500
 
 run-command: encode signal death as a positive integer
 
 When a sub-command dies due to a signal, we encode the
 signal number into the numeric exit status as signal -
 128. This is easy to identify (versus a regular positive
 error code), and when cast to an unsigned integer (e.g., by
 feeding it to exit), matches what a POSIX shell would return
 when reporting a signal death in $? or through its own exit
 code.
 
 So we have a negative value inside the code, but once it
 passes across an exit() barrier, it looks positive (and any
 code we receive from a sub-shell will have the positive
 form). E.g., death by SIGPIPE (signal 13) will look like
 -115 to us in inside git, but will end up as 141 when we
 call exit() with it. And a program killed by SIGPIPE but run
 via the shell will come to us with an exit code of 141.

Your patch changes the error code that is propagated via exit() in this
case. We cannot know nobody is checking for the specific code, because
the list of callers is every shell script or program which execs git.
Some of them do care about the exit code. I can give an example of a
case I have that cares, but I do not think it is even important. The
point is that we would be regressing an existing interface, and cannot
know who is broken by it.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] run-command: simplify wait_or_whine

2013-06-01 Thread Felipe Contreras
On Sat, Jun 1, 2013 at 12:01 PM, Jeff King p...@peff.net wrote:
 On Sat, Jun 01, 2013 at 09:30:50AM -0500, Felipe Contreras wrote:

 commit 709ca730f8e093005cc882bfb86c0ca9c83d345b
 Author: Jeff King p...@peff.net
 Date:   Sat Jan 5 09:49:49 2013 -0500

 run-command: encode signal death as a positive integer

 When a sub-command dies due to a signal, we encode the
 signal number into the numeric exit status as signal -
 128. This is easy to identify (versus a regular positive
 error code), and when cast to an unsigned integer (e.g., by
 feeding it to exit), matches what a POSIX shell would return
 when reporting a signal death in $? or through its own exit
 code.

 So we have a negative value inside the code, but once it
 passes across an exit() barrier, it looks positive (and any
 code we receive from a sub-shell will have the positive
 form). E.g., death by SIGPIPE (signal 13) will look like
 -115 to us in inside git, but will end up as 141 when we
 call exit() with it. And a program killed by SIGPIPE but run
 via the shell will come to us with an exit code of 141.

 Your patch changes the error code that is propagated via exit() in this
 case. We cannot know nobody is checking for the specific code, because
 the list of callers is every shell script or program which execs git.
 Some of them do care about the exit code. I can give an example of a
 case I have that cares, but I do not think it is even important. The
 point is that we would be regressing an existing interface, and cannot
 know who is broken by it.

Of course we can know, by going forward with the change, and quite
often that's the only way to know for sure.

But if it's true what you said that we haven't changed what the
process returns, then it doesn't make sense to attempt going forward,
because we have no reference point about the likelihood of scripts
relying on specific exit codes.

-- 
Felipe Contreras
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html