Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites
On Sat, Jan 5, 2013 at 1:35 AM, David Michael fedora@gmail.com wrote: It is possible for this pointer of the GIT_DIR environment variable to survive unduplicated until further getenv calls are made. The standards allow for subsequent calls of getenv to overwrite the string located at its returned pointer, and this can result in broken git operations on certain platforms. Signed-off-by: David Michael fedora@gmail.com --- I have encountered an issue with consecutive calls to getenv overwriting earlier values. Most notably, it prevents a plain git clone from working. Long story short: This value of GIT_DIR gets passed around setup.c until it reaches check_repository_format_gently. This function calls git_config_early, which eventually runs getenv(HOME). When it returns back to check_repository_format_gently, the gitdir variable contains my home directory path. The end result is that I wind up with ~/objects/ etc. and a failed repository clone. (Simply adding a bare getenv(GIT_DIR) afterwards to reset the pointer also corrects the problem.) Since other platforms are apparently working, yet this getenv behavior is supported by the standards, I am left wondering if this could be a symptom of something else being broken on my platform (z/OS). Can anyone more familiar with this part of git identify any condition that obviously should not be occurring? Thanks. I have some patches of a similar nature here: https://github.com/kusma/git/commits/work/getenv-safety These were written for an earlier version of the UTF-8 patches for Git for Windows, where we were looking into allowing getenv to use a static buffer to convert the environment variables from UTF-16 (which is what Windows maintains) to UTF-8. We ended converting the environment on start-up instead, so these weren't needed for us. But perhaps they can be of use to someone else? -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[BUG/PATCH] setup: Copy an environment variable to avoid overwrites
It is possible for this pointer of the GIT_DIR environment variable to survive unduplicated until further getenv calls are made. The standards allow for subsequent calls of getenv to overwrite the string located at its returned pointer, and this can result in broken git operations on certain platforms. Signed-off-by: David Michael fedora@gmail.com --- I have encountered an issue with consecutive calls to getenv overwriting earlier values. Most notably, it prevents a plain git clone from working. Long story short: This value of GIT_DIR gets passed around setup.c until it reaches check_repository_format_gently. This function calls git_config_early, which eventually runs getenv(HOME). When it returns back to check_repository_format_gently, the gitdir variable contains my home directory path. The end result is that I wind up with ~/objects/ etc. and a failed repository clone. (Simply adding a bare getenv(GIT_DIR) afterwards to reset the pointer also corrects the problem.) Since other platforms are apparently working, yet this getenv behavior is supported by the standards, I am left wondering if this could be a symptom of something else being broken on my platform (z/OS). Can anyone more familiar with this part of git identify any condition that obviously should not be occurring? Thanks. setup.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/setup.c b/setup.c index f108c4b..64fb160 100644 --- a/setup.c +++ b/setup.c @@ -675,8 +675,12 @@ static const char *setup_git_directory_gently_1(int *nongit_ok) * validation. */ gitdirenv = getenv(GIT_DIR_ENVIRONMENT); -if (gitdirenv) -return setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok); +if (gitdirenv) { +gitdirenv = xstrdup(gitdirenv); +ret = setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok); +free(gitdirenv); +return ret; +} if (env_ceiling_dirs) { string_list_split(ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1); -- 1.7.11.7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites
David Michael fedora@gmail.com writes: I have encountered an issue with consecutive calls to getenv overwriting earlier values. Most notably, it prevents a plain git clone from working. Long story short: This value of GIT_DIR gets passed around setup.c until it reaches check_repository_format_gently. This function calls git_config_early, which eventually runs getenv(HOME). When it returns back to check_repository_format_gently, the gitdir variable contains my home directory path. The end result is that I wind up with ~/objects/ etc. and a failed repository clone. (Simply adding a bare getenv(GIT_DIR) afterwards to reset the pointer also corrects the problem.) Since other platforms are apparently working, yet this getenv behavior is supported by the standards, I am left wondering if this could be a symptom of something else being broken on my platform (z/OS). The execve(2) function int execve(const char *filename, char *const argv[], char *const envp[]); takes a NULL terminated array of NUL terminated strings of form VAR=VAL in envp[], and this is kept in: extern char **environ; of the new image that runs. The most naive and straight-forward way to implement getenv(3) is to iterate over this environ[] array to look for an element that begins with GIT_DIR=, and return the pointer pointing at the location one byte past that equal sign. So even if the standard allowed the returned value to be volatile across calls to getenv(3), it will take *more* work for implementations if they want to break our use pattern. They have to deliberately return a string that they will overwrite in subsequent calls to getenv(3). Also the natural way to implement putenv(3) and setenv(3) is to replace the pointer in the environ[] array (not overwrite the existing string in environ[] that holds the VAR=VAL string that represents the current value, which might be shorter than the new value of the enviornment variable), hence even calling these functions is unlikely to invalidate the result you previously received from getenv(3). I am not at all surprised that nobody from other platforms has seen this breakage. In fact, http://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html says that only setenv(), unsetenv() and putenv() may invalidate previous return values. Note that getenv() is not listed as a function that is allowed to break return values from a previous call to getenv(). So in short, your platform's getenv(3) emulation may be broken. We have other calls to getenv() where we rely on the result of it being stable, and you might discover these places also break. Having said that, we do have codepaths to update a handful of environment variables ourselves (GIT_DIR is among them), so I think your patch is a good safety measure in general. setup.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/setup.c b/setup.c index f108c4b..64fb160 100644 --- a/setup.c +++ b/setup.c @@ -675,8 +675,12 @@ static const char *setup_git_directory_gently_1(int *nongit_ok) * validation. */ gitdirenv = getenv(GIT_DIR_ENVIRONMENT); -if (gitdirenv) -return setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok); +if (gitdirenv) { +gitdirenv = xstrdup(gitdirenv); +ret = setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok); +free(gitdirenv); +return ret; +} if (env_ceiling_dirs) { string_list_split(ceiling_dirs, env_ceiling_dirs, PATH_SEP, -1); -- 1.7.11.7 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites
On Sat, Jan 5, 2013 at 7:35 AM, David Michael fedora@gmail.com wrote: -if (gitdirenv) -return setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok); +if (gitdirenv) { +gitdirenv = xstrdup(gitdirenv); +ret = setup_explicit_git_dir(gitdirenv, cwd, len, nongit_ok); +free(gitdirenv); +return ret; +} Maybe we could all this into a wrapper? If getenv() here has a problem, many other places may have the same problem too. This simplifies the change. But one has to check that getenv() must not be used in threaded code. char *git_getenv(const char *env) { static int bufno; static char *buf[4]; bufno = (bufno + 1) % 4; free(buf[bufno]); buf[bufno] = xstrdup(getenv(env)); return buf[bufno]; } #define getenv(x) git_getenv(x) -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites
Junio C Hamano gits...@pobox.com writes: ... So even if the standard allowed the returned value to be volatile across calls to getenv(3),... ... In fact, http://pubs.opengroup.org/onlinepubs/9699919799/functions/getenv.html says that only ... Apparently I wasn't even reading what I was quoting carefully enough. The above does include getenv() as one of the functions that are allowed to invalidate earlier return values. Sorry about that. I'll go back to bed (I am a bit under the weather and OOO today). The conclusion in my original message is still valid. Having said that, we do have codepaths to update a handful of environment variables ourselves (GIT_DIR is among them), so I think your patch is a good safety measure in general. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites
Duy Nguyen pclo...@gmail.com writes: Maybe we could all this into a wrapper? If getenv() here has a problem, many other places may have the same problem too. This simplifies the change. But one has to check that getenv() must not be used in threaded code. That needs to be done regardless, if we care; POSIX explicitly says getenv() need not be thread-safe. I personally do not think a wrapper with limited slots is a healthy direction to go. Most places we use getenv() do not let the return value live across their scope, and those that do should explicitly copy the value away. It's between validating that there is _no_ *env() calls in the codepath between a getenv() call and the use of its return value, and validating that there is at most 4 such calls there. The former is much easier to verify and maintain, I think. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites
On Sat, Jan 5, 2013 at 11:38 AM, Junio C Hamano gits...@pobox.com wrote: I personally do not think a wrapper with limited slots is a healthy direction to go. Most places we use getenv() do not let the return value live across their scope, and those that do should explicitly copy the value away. It's between validating that there is _no_ *env() calls in the codepath between a getenv() call and the use of its return value, and validating that there is at most 4 such calls there. The former is much easier to verify and maintain, I think. I did not look carefully and was scared of 143 getenv calls. But with about 4 calls, yes it's best to do without the wrapper. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [BUG/PATCH] setup: Copy an environment variable to avoid overwrites
Duy Nguyen pclo...@gmail.com writes: On Sat, Jan 5, 2013 at 11:38 AM, Junio C Hamano gits...@pobox.com wrote: I personally do not think a wrapper with limited slots is a healthy direction to go. Most places we use getenv() do not let the return value live across their scope, and those that do should explicitly copy the value away. It's between validating that there is _no_ *env() calls in the codepath between a getenv() call and the use of its return value, and validating that there is at most 4 such calls there. The former is much easier to verify and maintain, I think. I did not look carefully and was scared of 143 getenv calls. But with about 4 calls, yes it's best to do without the wrapper. Just to make sure you did not misunderstand, the 4 (four) in my message is not about 4 calls among 143 are unsafe. It was referring to the number of rotating slots your patch defined, which means val = getenv(FOO); ... random other code ... use(val) is safe only if random other code makes less than 4 getenv() calls. I didn't verify all of the call sites. It needs to be done with or without your wrapper patch. Without your wrapper, the validation needs to make sure random other code above does not make any getenv() call. With your wrapper, the validation needs to make sure random other code above does not make more than 4 such calls. My point was that the effort needed for both are about the same, so your wrapper does not buy us much. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html