Re: git segfaults on older Solaris releases

2016-04-12 Thread Patrick Steinhardt
On Sat, Apr 09, 2016 at 01:39:05PM -0400, Jeff King wrote:
> On Sat, Apr 09, 2016 at 09:02:38AM +0200, Tom G. Christensen wrote:
> 
> > I've finished testing 2.8.1 and I found one more case where a null is being
> > printed and causing a segfault. This happens even on Solaris 8 and 9.
> > The failling test is t3200.63.
> 
> Oh good, this one wasn't me. :)
> 
> It's just a normal "oops, we feed NULL and nobody on glibc noticed
> because it silently replaced it with "(null)" case. I did find a few
> other oddities while fixing it, though. +cc Patrick, who worked in this
> area most recently.
> 
>   [1/3]: config: lower-case first word of error strings
>   [2/3]: git_config_set_multivar_in_file: all non-zero returns are errors
>   [3/3]: git_config_set_multivar_in_file: handle "unset" errors
> 
> I think we may want some additional improvements. While doing 1/3, I
> noticed that many of these error messages could stand to be marked for
> translation. As other people are already looking at mass-conversion,
> I stopped short of doing it here (and merely contented myself with
> throwing a conflict into their patches ;) ).
> 
> The other thing is that 2/3 notices the error return from the
> config-setting functions is weird. It's sometimes negative and sometimes
> positive. I fixed this caller, but I think it's possible for the
> negative values to leak into our exit codes:
> 
>   $ touch .git/config
>   $ git config foo.bar baz
>   error: could not lock config file .git/config: File exists
>   $ echo $?
>   255
> 
> I seem to recall some systems having trouble with negative error codes,
> so we may want to make that more consistent.
> 
> -Peff

Oh, yeah. Those patches look good to me, thanks for cleaning up
my error.

Patrick


signature.asc
Description: PGP signature


Re: git segfaults on older Solaris releases

2016-04-09 Thread Jeff King
On Sat, Apr 09, 2016 at 10:17:56PM +0200, Tom G. Christensen wrote:

> On 09/04/16 19:39, Jeff King wrote:
> 
> >   [1/3]: config: lower-case first word of error strings
> >   [2/3]: git_config_set_multivar_in_file: all non-zero returns are errors
> >   [3/3]: git_config_set_multivar_in_file: handle "unset" errors
> >
> 
> I applied them to 2.8.1 and ran the testsuite again on Solaris 8/x86 and the
> segfault is gone.

Thanks for testing. By the way, I ran the whole test suite with "--tee -v"
and grepped for "(null)", which does find this case on glibc systems. I
didn't see any other interesting cases (there _are_ mentions of
"(null)", but they are from our code, not glibc converting NULLs). Which
I guess is just corroborating your testing, since you would have seen
any bad cases as segfaults.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-09 Thread Tom G. Christensen

On 09/04/16 19:39, Jeff King wrote:


   [1/3]: config: lower-case first word of error strings
   [2/3]: git_config_set_multivar_in_file: all non-zero returns are errors
   [3/3]: git_config_set_multivar_in_file: handle "unset" errors



I applied them to 2.8.1 and ran the testsuite again on Solaris 8/x86 and 
the segfault is gone.


-tgc

--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-09 Thread Jeff King
On Sat, Apr 09, 2016 at 09:02:38AM +0200, Tom G. Christensen wrote:

> I've finished testing 2.8.1 and I found one more case where a null is being
> printed and causing a segfault. This happens even on Solaris 8 and 9.
> The failling test is t3200.63.

Oh good, this one wasn't me. :)

It's just a normal "oops, we feed NULL and nobody on glibc noticed
because it silently replaced it with "(null)" case. I did find a few
other oddities while fixing it, though. +cc Patrick, who worked in this
area most recently.

  [1/3]: config: lower-case first word of error strings
  [2/3]: git_config_set_multivar_in_file: all non-zero returns are errors
  [3/3]: git_config_set_multivar_in_file: handle "unset" errors

I think we may want some additional improvements. While doing 1/3, I
noticed that many of these error messages could stand to be marked for
translation. As other people are already looking at mass-conversion,
I stopped short of doing it here (and merely contented myself with
throwing a conflict into their patches ;) ).

The other thing is that 2/3 notices the error return from the
config-setting functions is weird. It's sometimes negative and sometimes
positive. I fixed this caller, but I think it's possible for the
negative values to leak into our exit codes:

  $ touch .git/config
  $ git config foo.bar baz
  error: could not lock config file .git/config: File exists
  $ echo $?
  255

I seem to recall some systems having trouble with negative error codes,
so we may want to make that more consistent.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-09 Thread Tom G. Christensen

On 07/04/16 22:19, Tom G. Christensen wrote:

On 07/04/16 20:50, Junio C Hamano wrote:

Junio C Hamano  writes:
So perhaps this is all we need to fix your box.

  setup.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)



I applied this patch to 2.8.0 and have completed a testsuite run on
Solaris 2.6/x86 with only a few unrelated problems.
I will continue testing on the other Solaris < 10 releases but I do not
expect any difference in the outcome.



I've finished testing 2.8.1 and I found one more case where a null is 
being printed and causing a segfault. This happens even on Solaris 8 and 9.

The failling test is t3200.63.

Here is the backtrace from a Solaris 8/SPARC machine:

(gdb) core trash directory.t3200-branch/core
[New LWP 1]
[Thread debugging using libthread_db enabled]
[New Thread 1 (LWP 1)]
Core was generated by `/export/home/tgc/buildpkg/git/src/git-2.8.1/git 
branch --unset-upstream'.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xfecb32cc in strlen () from /usr/lib/libc.so.1
(gdb) bt
#0  0xfecb32cc in strlen () from /usr/lib/libc.so.1
#1  0xfed06508 in _doprnt () from /usr/lib/libc.so.1
#2  0xfed08690 in vfprintf () from /usr/lib/libc.so.1
#3  0x001487bc in vreportf (prefix=, err=, 
params=0xffbfe408) at usage.c:23
#4  0x0014881c in die_builtin (err=0x198f90 "Could not set '%s' to 
'%s'", params=0xffbfe408) at usage.c:35
#5  0x00148934 in die (err=0x198f90 "Could not set '%s' to '%s'") at 
usage.c:108
#6  0x000af1b0 in git_config_set_multivar_in_file (value=0x0, 
key=0x1ecca0 "branch.master.remote",
config_filename=, value_regex=, 
multi_replace=) at config.c:2226
#7  git_config_set_multivar_in_file (config_filename=0x0, key=0x1ecca0 
"branch.master.remote", value=0x0,

value_regex=0x0, multi_replace=1) at config.c:2220
#8  0x0003aa6c in cmd_branch (argc=0, argv=0xffbfec00, prefix=out>) at builtin/branch.c:793
#9  0x000255e8 in run_builtin (argv=0xffbfec00, argc=2, p=0x1c365c 
) at git.c:353

#10 handle_builtin (argc=2, argv=0xffbfec00) at git.c:540
#11 0x00168ecc in run_argv (argv=0xffbfeb30, argcp=0xffbfebdc) at git.c:594
#12 main (argc=2, av=) at git.c:701
(gdb)


-tgc
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread Jeff King
On Thu, Apr 07, 2016 at 12:37:53PM -0700, Junio C Hamano wrote:

> -- >8 --
> Subject: setup.c: do not feed NULL to "%.*s" even with the precision 0
> 
> A recent update 75faa45a (replace trivial malloc + sprintf / strcpy
> calls with xstrfmt, 2015-09-24) rewrote
> 
>   prepare an empty buffer
>   if (len)
>   append the first len bytes of "prefix" to the buffer
>   append "path" to the buffer
> 
> that computed "path", optionally prefixed by "prefix", into
> 
>   xstrfmt("%.*s%s", len, prefix, path);
> 
> However, passing a NULL pointer to the printf(3) family of functions
> to format it with %s conversion, even with the precision 0, i.e.
> 
>   xstrfmt("%.*s", 0, NULL)
> 
> yields undefined results, at least on some platforms.  
> 
> Avoid this problem by substituting prefix with "" when len==0, as
> prefix can legally be NULL in that case.  This would mimick the
> intent of the original code better.
> 
> Reported-by: "Tom G. Christensen" 
> Helped-by: Jeff King 
> Signed-off-by: Junio C Hamano 
> ---

Nicely explained.

Acked-by: Jeff King 

Thanks.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread Tom G. Christensen

On 07/04/16 20:50, Junio C Hamano wrote:

Junio C Hamano  writes:
So perhaps this is all we need to fix your box.

  setup.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)



Yes that seems to work very well.

I applied this patch to 2.8.0 and have completed a testsuite run on 
Solaris 2.6/x86 with only a few unrelated problems.
I will continue testing on the other Solaris < 10 releases but I do not 
expect any difference in the outcome.


-tgc
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread Junio C Hamano
Jeff King  writes:

>> So perhaps this is all we need to fix your box.
>> 
>>  setup.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/setup.c b/setup.c
>> index 3439ec6..b6c8aab 100644
>> --- a/setup.c
>> +++ b/setup.c
>> @@ -103,7 +103,7 @@ char *prefix_path_gently(const char *prefix, int len,
>>  return NULL;
>>  }
>>  } else {
>> -sanitized = xstrfmt("%.*s%s", len, prefix, path);
>> +sanitized = xstrfmt("%.*s%s", len, prefix ? prefix : "", path);
>>  if (remaining_prefix)
>>  *remaining_prefix = len;
>>  if (normalize_path_copy_len(sanitized, sanitized, 
>> remaining_prefix)) {
>
> The original pre-75faa45ae0230b321bf72027b2274315d7e14e34 version
> checked "if (len)", but I think this should be equally right.

That's a good point.  The original had

-   sanitized = xmalloc(len + strlen(path) + 1);
-   if (len)
-   memcpy(sanitized, prefix, len);
-   strcpy(sanitized + len, path);
+   sanitized = xstrfmt("%.*s%s", len, prefix, path);

and for a brief moment I was wondering why the strlen() did not
barf, until I realized that was on path not prefix.

-- >8 --
Subject: setup.c: do not feed NULL to "%.*s" even with the precision 0

A recent update 75faa45a (replace trivial malloc + sprintf / strcpy
calls with xstrfmt, 2015-09-24) rewrote

prepare an empty buffer
if (len)
append the first len bytes of "prefix" to the buffer
append "path" to the buffer

that computed "path", optionally prefixed by "prefix", into

xstrfmt("%.*s%s", len, prefix, path);

However, passing a NULL pointer to the printf(3) family of functions
to format it with %s conversion, even with the precision 0, i.e.

xstrfmt("%.*s", 0, NULL)

yields undefined results, at least on some platforms.  

Avoid this problem by substituting prefix with "" when len==0, as
prefix can legally be NULL in that case.  This would mimick the
intent of the original code better.

Reported-by: "Tom G. Christensen" 
Helped-by: Jeff King 
Signed-off-by: Junio C Hamano 
---
 setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 3439ec6..b4a92fe 100644
--- a/setup.c
+++ b/setup.c
@@ -103,7 +103,7 @@ char *prefix_path_gently(const char *prefix, int len,
return NULL;
}
} else {
-   sanitized = xstrfmt("%.*s%s", len, prefix, path);
+   sanitized = xstrfmt("%.*s%s", len, len ? prefix : "", path);
if (remaining_prefix)
*remaining_prefix = len;
if (normalize_path_copy_len(sanitized, sanitized, 
remaining_prefix)) {
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread Jeff King
On Thu, Apr 07, 2016 at 11:50:46AM -0700, Junio C Hamano wrote:

> Junio C Hamano  writes:
> 
> > "Tom G. Christensen"  writes:
> >
> >> The reason for the crash is simple, a null value was passed to the 's'
> >> format for the *printf family of functions.
> >> ...
> >> Passing a null value to the 's' format is explicitly documented as
> >> giving undefined results on Solaris, even on Solaris 11(2).

Thanks, TIL (though it is not really surprising, I guess, since some
memcpy implementations have the same problem).

> So, I've looked at places where we use "%.*s" with "prefix" nearby,
> and it seems that this is the only place.

Thank you for digging; I obviously didn't think about this issue at all
when doing the mass conversions recently.

> The "prefix" being a NULL is a perfectly valid state throughout the
> system and means a different thing than it being an empty string, so
> it is valid for callers of prefix_path() and prefix_path_gently() to
> pass prefix=NULL as long as they pass len=0.
> 
> So perhaps this is all we need to fix your box.
> 
>  setup.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/setup.c b/setup.c
> index 3439ec6..b6c8aab 100644
> --- a/setup.c
> +++ b/setup.c
> @@ -103,7 +103,7 @@ char *prefix_path_gently(const char *prefix, int len,
>   return NULL;
>   }
>   } else {
> - sanitized = xstrfmt("%.*s%s", len, prefix, path);
> + sanitized = xstrfmt("%.*s%s", len, prefix ? prefix : "", path);
>   if (remaining_prefix)
>   *remaining_prefix = len;
>   if (normalize_path_copy_len(sanitized, sanitized, 
> remaining_prefix)) {

The original pre-75faa45ae0230b321bf72027b2274315d7e14e34 version
checked "if (len)", but I think this should be equally right.

-Peff
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread Tom G. Christensen

On 07/04/16 20:32, Junio C Hamano wrote:

"Tom G. Christensen"  writes:


The reason for the crash is simple, a null value was passed to the 's'
format for the *printf family of functions.
...
Passing a null value to the 's' format is explicitly documented as
giving undefined results on Solaris, even on Solaris 11(2).


Do you mean

*printf("...%.*s...", ..., 0, NULL, ...)

i.e. you saw a NULL passed only when we use %.*s with width=0?



Maybe? Not sure what you're asking exactly.

I'm seing what is in the backtrace from gdb and that is prefix is NULL 
(0x0) which ends up being printed using some variant of '%s' after going 
through the various wrappers.


I hacked around it in run_builtin() as a proof and have also made some 
experiments with working around it in setup_git_directory_gently() which 
got me a bit further but it looks like there are places that do 
if(prefix) which now does not behave as expected because prefix is not NULL.


-tgc




--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread David Turner
On Thu, 2016-04-07 at 11:50 -0700, Junio C Hamano wrote:
> Junio C Hamano  writes:
> 
> > "Tom G. Christensen"  writes:
> > 
> > > The reason for the crash is simple, a null value was passed to
> > > the 's'
> > > format for the *printf family of functions.
> > > ...
> > > Passing a null value to the 's' format is explicitly documented
> > > as
> > > giving undefined results on Solaris, even on Solaris 11(2).
> > 
> > Do you mean
> > 
> > *printf("...%.*s...", ..., 0, NULL, ...)
> > 
> > i.e. you saw a NULL passed only when we use %.*s with width=0?
> 
> So, I've looked at places where we use "%.*s" with "prefix" nearby,
> and it seems that this is the only place.
> 
> The "prefix" being a NULL is a perfectly valid state throughout the
> system and means a different thing than it being an empty string, so
> it is valid for callers of prefix_path() and prefix_path_gently() to
> pass prefix=NULL as long as they pass len=0.

TIOLI: The code below does not reflect the "as long as they pass len=0"
condition.  Maybe add an assert for that?  


--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread Junio C Hamano
Junio C Hamano  writes:

> "Tom G. Christensen"  writes:
>
>> The reason for the crash is simple, a null value was passed to the 's'
>> format for the *printf family of functions.
>> ...
>> Passing a null value to the 's' format is explicitly documented as
>> giving undefined results on Solaris, even on Solaris 11(2).
>
> Do you mean
>
>   *printf("...%.*s...", ..., 0, NULL, ...)
>
> i.e. you saw a NULL passed only when we use %.*s with width=0?

So, I've looked at places where we use "%.*s" with "prefix" nearby,
and it seems that this is the only place.

The "prefix" being a NULL is a perfectly valid state throughout the
system and means a different thing than it being an empty string, so
it is valid for callers of prefix_path() and prefix_path_gently() to
pass prefix=NULL as long as they pass len=0.

So perhaps this is all we need to fix your box.

 setup.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/setup.c b/setup.c
index 3439ec6..b6c8aab 100644
--- a/setup.c
+++ b/setup.c
@@ -103,7 +103,7 @@ char *prefix_path_gently(const char *prefix, int len,
return NULL;
}
} else {
-   sanitized = xstrfmt("%.*s%s", len, prefix, path);
+   sanitized = xstrfmt("%.*s%s", len, prefix ? prefix : "", path);
if (remaining_prefix)
*remaining_prefix = len;
if (normalize_path_copy_len(sanitized, sanitized, 
remaining_prefix)) {
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: git segfaults on older Solaris releases

2016-04-07 Thread Junio C Hamano
"Tom G. Christensen"  writes:

> The reason for the crash is simple, a null value was passed to the 's'
> format for the *printf family of functions.
> ...
> Passing a null value to the 's' format is explicitly documented as
> giving undefined results on Solaris, even on Solaris 11(2).

Do you mean

*printf("...%.*s...", ..., 0, NULL, ...)

i.e. you saw a NULL passed only when we use %.*s with width=0?
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


git segfaults on older Solaris releases

2016-04-07 Thread Tom G. Christensen

Hello,

While working on an update to the git packages in tgcware(1) I ran into 
segfaults when running the testsuite.


Here's what it looks like on Solaris 7/SPARC:

Core was generated by 
`/export/home/tgc/buildpkg/git/src/git-upstream/git update-index 
should-be-empty'.

Program terminated with signal SIGSEGV, Segmentation fault.
#0  0xfee81ef4 in _doprnt () from /usr/lib/libc.so.1
(gdb) bt
#0  0xfee81ef4 in _doprnt () from /usr/lib/libc.so.1
#1  0xfee83ce4 in vsnprintf () from /usr/lib/libc.so.1
#2  0x00138dbc in strbuf_vaddf (sb=0xffbedd24, fmt=0x1af7b8 "%.*s%s", 
ap=0xffbedde0) at strbuf.c:279
#3  0x00139f78 in xstrvfmt (fmt=0x1af7b8 "%.*s%s", ap=0xffbedde0) at 
strbuf.c:698

#4  0x00139fb4 in xstrfmt (fmt=0x1af7b8 "%.*s%s") at strbuf.c:708
#5  0x0012a0ec in prefix_path_gently (prefix=0x0, len=0, 
remaining_prefix=0x0, path=) at setup.c:103
#6  0x0012a2f0 in prefix_path (prefix=0x0, len=0, path=0xffbee7fc 
"should-be-empty") at setup.c:116
#7  0x00098464 in cmd_update_index (argc=2, argv=, 
prefix=0x0) at builtin/update-index.c:1042
#8  0x00025900 in run_builtin (argv=0xffbee630, argc=2, p=0x1c9adc 
) at git.c:346

#9  handle_builtin (argc=2, argv=0xffbee630) at git.c:536
#10 0x00025bec in run_argv (argv=0xffbee5c4, argcp=0xffbee60c) at git.c:582
#11 main (argc=2, av=) at git.c:690
(gdb)


The reason for the crash is simple, a null value was passed to the 's' 
format for the *printf family of functions.
To verify I modified git.c:run_builtin() so it would assign "" to prefix 
if NULL just before the status = p->fn(..) call.
This allowed t-basic.sh to pass where before it would fail because 
git segfaulted in multiple tests.


Passing a null value to the 's' format is explicitly documented as 
giving undefined results on Solaris, even on Solaris 11(2).
It happens that Solaris 8 and later will tolerate this without crashing, 
though I suspect at least for Solaris 8 and 9 it might require a certain 
patchlevel to do so. Earlier releases will just segfault as shown above.


I bisected it on Solaris 2.6 and found that 75faa45 was the commit that 
caused this problem to appear. The 2.6.x releases build and run fine.


I know of course that Solaris < 8 is not terribly interesting as a 
portability target so I understand if you're unwilling to fix this as it 
seems it might be a somewhat invasive change.


-tgc

1) http://jupiterrise.com/tgcware/tgcware.solaris.html
2) http://docs.oracle.com/cd/E23824_01/html/821-1465/printf-3c.html
--
To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html