Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-07 Thread Erik Faye-Lund
On Sat, Jan 5, 2013 at 1:35 AM, David Michael fedora@gmail.com wrote:
 It is possible for this pointer of the GIT_DIR environment variable to
 survive unduplicated until further getenv calls are made.  The standards
 allow for subsequent calls of getenv to overwrite the string located at
 its returned pointer, and this can result in broken git operations on
 certain platforms.

 Signed-off-by: David Michael fedora@gmail.com
 ---

 I have encountered an issue with consecutive calls to getenv
 overwriting earlier values.  Most notably, it prevents a plain git
 clone from working.

 Long story short: This value of GIT_DIR gets passed around setup.c
 until it reaches check_repository_format_gently.  This function calls
 git_config_early, which eventually runs getenv(HOME).  When it
 returns back to check_repository_format_gently, the gitdir variable
 contains my home directory path.  The end result is that I wind up
 with ~/objects/ etc. and a failed repository clone.  (Simply adding a
 bare getenv(GIT_DIR) afterwards to reset the pointer also corrects
 the problem.)

 Since other platforms are apparently working, yet this getenv behavior
 is supported by the standards, I am left wondering if this could be a
 symptom of something else being broken on my platform (z/OS).  Can
 anyone more familiar with this part of git identify any condition that
 obviously should not be occurring?

 Thanks.

I have some patches of a similar nature here:

https://github.com/kusma/git/commits/work/getenv-safety

These were written for an earlier version of the UTF-8 patches for Git
for Windows, where we were looking into allowing getenv to use a
static buffer to convert the environment variables from UTF-16 (which
is what Windows maintains) to UTF-8. We ended converting the
environment on start-up instead, so these weren't needed for us. But
perhaps they can be of use to someone else?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-04 Thread David Michael
It is possible for this pointer of the GIT_DIR environment variable to
survive unduplicated until further getenv calls are made.  The standards
allow for subsequent calls of getenv to overwrite the string located at
its returned pointer, and this can result in broken git operations on
certain platforms.

Signed-off-by: David Michael fedora@gmail.com
---

I have encountered an issue with consecutive calls to getenv
overwriting earlier values.  Most notably, it prevents a plain git
clone from working.

Long story short: This value of GIT_DIR gets passed around setup.c
until it reaches check_repository_format_gently.  This function calls
git_config_early, which eventually runs getenv(HOME).  When it
returns back to check_repository_format_gently, the gitdir variable
contains my home directory path.  The end result is that I wind up
with ~/objects/ etc. and a failed repository clone.  (Simply adding a
bare getenv(GIT_DIR) afterwards to reset the pointer also corrects
the problem.)

Since other platforms are apparently working, yet this getenv behavior
is supported by the standards, I am left wondering if this could be a
symptom of something else being broken on my platform (z/OS).  Can
anyone more familiar with this part of git identify any condition that
obviously should not be occurring?

Thanks.

 setup.c | 8 ++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/setup.c b/setup.c
index f108c4b..64fb160 100644
--- a/setup.c
+++ b/setup.c
@@ -675,8 +675,12 @@ static const char
*setup_git_directory_gently_1(int *nongit_ok)
  * validation.
  */
 gitdirenv = getenv(GIT_DIR_ENVIRONMENT);
-if (gitdirenv)
-return setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok);
+if (gitdirenv) {
+gitdirenv = xstrdup(gitdirenv);
+ret = setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok);
+free(gitdirenv);
+return ret;
+}

 if (env_ceiling_dirs) {
 string_list_split(ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
--
1.7.11.7
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-04 Thread Junio C Hamano
David Michael fedora@gmail.com writes:

 I have encountered an issue with consecutive calls to getenv
 overwriting earlier values.  Most notably, it prevents a plain git
 clone from working.

 Long story short: This value of GIT_DIR gets passed around setup.c
 until it reaches check_repository_format_gently.  This function calls
 git_config_early, which eventually runs getenv(HOME).  When it
 returns back to check_repository_format_gently, the gitdir variable
 contains my home directory path.  The end result is that I wind up
 with ~/objects/ etc. and a failed repository clone.  (Simply adding a
 bare getenv(GIT_DIR) afterwards to reset the pointer also corrects
 the problem.)

 Since other platforms are apparently working, yet this getenv behavior
 is supported by the standards, I am left wondering if this could be a
 symptom of something else being broken on my platform (z/OS).

The execve(2) function

   int execve(const char *filename, char *const argv[],
  char *const envp[]);

takes a NULL terminated array of NUL terminated strings of form
VAR=VAL in envp[], and this is kept in:

extern char **environ;

of the new image that runs.

The most naive and straight-forward way to implement getenv(3) is to
iterate over this environ[] array to look for an element that begins
with GIT_DIR=, and return the pointer pointing at the location one
byte past that equal sign.  So even if the standard allowed the
returned value to be volatile across calls to getenv(3), it will
take *more* work for implementations if they want to break our use
pattern.  They have to deliberately return a string that they will
overwrite in subsequent calls to getenv(3).

Also the natural way to implement putenv(3) and setenv(3) is to
replace the pointer in the environ[] array (not overwrite the
existing string in environ[] that holds the VAR=VAL string that
represents the current value, which might be shorter than the new
value of the enviornment variable), hence even calling these
functions is unlikely to invalidate the result you previously
received from getenv(3).

I am not at all surprised that nobody from other platforms has seen
this breakage.

In fact,

http://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html

says that only setenv(), unsetenv() and putenv() may invalidate
previous return values.  Note that getenv() is not listed as a
function that is allowed to break return values from a previous call
to getenv().

So in short, your platform's getenv(3) emulation may be broken.  We
have other calls to getenv() where we rely on the result of it being
stable, and you might discover these places also break.

Having said that, we do have codepaths to update a handful of
environment variables ourselves (GIT_DIR is among them), so I think
your patch is a good safety measure in general.

  setup.c | 8 ++--
  1 file changed, 6 insertions(+), 2 deletions(-)

 diff --git a/setup.c b/setup.c
 index f108c4b..64fb160 100644
 --- a/setup.c
 +++ b/setup.c
 @@ -675,8 +675,12 @@ static const char
 *setup_git_directory_gently_1(int *nongit_ok)
   * validation.
   */
  gitdirenv = getenv(GIT_DIR_ENVIRONMENT);
 -if (gitdirenv)
 -return setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok);
 +if (gitdirenv) {
 +gitdirenv = xstrdup(gitdirenv);
 +ret = setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok);
 +free(gitdirenv);
 +return ret;
 +}

  if (env_ceiling_dirs) {
  string_list_split(ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1);
 --
 1.7.11.7
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-04 Thread Duy Nguyen
On Sat, Jan 5, 2013 at 7:35 AM, David Michael fedora@gmail.com wrote:
 -if (gitdirenv)
 -return setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok);
 +if (gitdirenv) {
 +gitdirenv = xstrdup(gitdirenv);
 +ret = setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok);
 +free(gitdirenv);
 +return ret;
 +}

Maybe we could all this into a wrapper? If getenv() here has a
problem, many other places may have the same problem too. This
simplifies the change. But one has to check that getenv() must not be
used in threaded code.

char *git_getenv(const char *env)
{
   static int bufno;
   static char *buf[4];
   bufno = (bufno + 1) % 4;
   free(buf[bufno]);
   buf[bufno] = xstrdup(getenv(env));
   return buf[bufno];
}
#define getenv(x) git_getenv(x)

-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-04 Thread Junio C Hamano
Junio C Hamano gits...@pobox.com writes:

 ...  So even if the standard allowed the
 returned value to be volatile across calls to getenv(3),...
 ...
 In fact,

 http://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html

 says that only ...

Apparently I wasn't even reading what I was quoting carefully
enough.  The above does include getenv() as one of the functions
that are allowed to invalidate earlier return values.

Sorry about that.  I'll go back to bed (I am a bit under the weather
and OOO today).  The conclusion in my original message is still
valid.

 Having said that, we do have codepaths to update a handful of
 environment variables ourselves (GIT_DIR is among them), so I think
 your patch is a good safety measure in general.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-04 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 Maybe we could all this into a wrapper? If getenv() here has a
 problem, many other places may have the same problem too. This
 simplifies the change. But one has to check that getenv() must not be
 used in threaded code.

That needs to be done regardless, if we care; POSIX explicitly says
getenv() need not be thread-safe.

I personally do not think a wrapper with limited slots is a healthy
direction to go.  Most places we use getenv() do not let the return
value live across their scope, and those that do should explicitly
copy the value away.  It's between validating that there is _no_ *env()
calls in the codepath between a getenv() call and the use of its
return value, and validating that there is at most 4 such calls there.
The former is much easier to verify and maintain, I think.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-04 Thread Duy Nguyen
On Sat, Jan 5, 2013 at 11:38 AM, Junio C Hamano gits...@pobox.com wrote:
 I personally do not think a wrapper with limited slots is a healthy
 direction to go.  Most places we use getenv() do not let the return
 value live across their scope, and those that do should explicitly
 copy the value away.  It's between validating that there is _no_ *env()
 calls in the codepath between a getenv() call and the use of its
 return value, and validating that there is at most 4 such calls there.
 The former is much easier to verify and maintain, I think.

I did not look carefully and was scared of 143 getenv calls. But with
about 4 calls, yes it's best to do without the wrapper.
-- 
Duy
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites

2013-01-04 Thread Junio C Hamano
Duy Nguyen pclo...@gmail.com writes:

 On Sat, Jan 5, 2013 at 11:38 AM, Junio C Hamano gits...@pobox.com wrote:
 I personally do not think a wrapper with limited slots is a healthy
 direction to go.  Most places we use getenv() do not let the return
 value live across their scope, and those that do should explicitly
 copy the value away.  It's between validating that there is _no_ *env()
 calls in the codepath between a getenv() call and the use of its
 return value, and validating that there is at most 4 such calls there.
 The former is much easier to verify and maintain, I think.

 I did not look carefully and was scared of 143 getenv calls. But with
 about 4 calls, yes it's best to do without the wrapper.

Just to make sure you did not misunderstand, the 4 (four) in my
message is not about 4 calls among 143 are unsafe.

It was referring to the number of rotating slots your patch defined,
which means

val = getenv(FOO);
... random other code ...
use(val)

is safe only if random other code makes less than 4 getenv() calls.

I didn't verify all of the call sites. It needs to be done with or
without your wrapper patch. Without your wrapper, the validation
needs to make sure random other code above does not make any getenv()
call.  With your wrapper, the validation needs to make sure random
other code above does not make more than 4 such calls.  My point
was that the effort needed for both are about the same, so your
wrapper does not buy us much.

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html