Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths
On Fri, Oct 21, 2016 at 11:23:53AM -0700, Junio C Hamano wrote: > A request to "git:///", depending on , results > in "directory" given to path_ok() in a bit different forms. Namely, > connect.c::parse_connect_url() gives > > URL directory > git://site/nouser.git /nouser.git > git://site/~nouser.git~nouser.git > > by special casing "~user" syntax (in other words, a pathname that > begins with a tilde _is_ special cased, and tilde is not considered > a normal character that can be in a pathname). Note the lack of > leading slash in the second one. > > Because that is how the shipped clients work, the daemon needs a bit > of adjustment, because interpolation and base-path prefix codepaths > wants to accept only paths that begin with a slash, and relies on > the slash when interpolating it in the template or concatenating it > to the base (e.g. roughly "sprintf(buf, "%s%s", base_path, dir)"). > > I came up with the following as a less invasive patch. There no > longer is the ase where "user-path not allowed" error is given, > as treating tilde as just a normal pathname character even when it > appears at the beginning is the whole point of this change. Thanks for explaining this. It is rather gross, but I think your less-invasive patch is the best we could do given the client behavior. And it's more what I would have expected based on the original problem description. > As I said already, I am not sure if this is a good change, though. I also am on the fence. I'm not sure I understand a compelling reason to have a non-interpolated "~" in the repo path name. Sure, it's a limitation, but why does anybody care? > @@ -170,11 +171,11 @@ static const char *path_ok(const char *directory, > struct hostinfo *hi) > return NULL; > } > > + if (!user_path && *dir == '~') { > + snprintf(tilde_path, PATH_MAX, "/%s", dir); > + dir = tilde_path; > + } I know you are following existing convention in this function to use an unchecked snprintf(), but it makes me wonder what kind of mischief a client could get up to by silently truncating via snprintf. This function is, after all, supposed to be checking the quality of the incoming path. xsnprintf() is probably too blunt a hammer, but: if (snprintf(rpath, PATH_MAX, ...) >= PATH_MAX) { logerror("path too long"); return NULL; } in various places may be appropriate. -Peff
Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths
Junio C Hamano writes: > Duy Nguyen writes: > >> The amount of changes is unbelievable for fixing such a rare case >> though. I wonder if we can just detect this in daemon.c and pass >> "./~foo/bar" instead of "~foo/bar" to enter_repo() in non-strict mode >> to "disable" expand_user_path(). If it works, it's much simpler >> changes (even though a bit hacky) > > Conceptually, it ought to be updating the code that says "Does it > begin with a tilde? Then try to do user-path expansion and fail if > that is not enabled and if it succeeds use the resulting directory" > to "Is user-path enabled and does it begin with a tilde? Then try > to do user-path expansion and if it succeeds use the resulting > directory". Compared to that mental model we have with this > codepath, I agree that the change does look quite invasive and > large. > > It is OK for a change to be large, as long as the end result is > easier to read and understand than the original. I am undecided if > that is the case with this patch, though. I am still not convinced that this is a bugfix (as opposed to "add a new feature to please Luke while regressing it for others"), but the "this looks too invasive" made me look at the codepath involved. There is this code in daemon.c::path_ok() that takes "dir" and ends up calling enter_repo(). if (*dir == '~') { if (!user_path) { logerror("'%s': User-path not allowed", dir); return NULL; } At first glance, it ought to be a single-liner - if (*dir == '~') { + if (user_path && *dir == '~') { to make 'git://site/~foo.git" go to "/var/scm/~foo.git" when base path is set to /var/scm/. Unfortunately that is not sufficient. A request to "git:///", depending on , results in "directory" given to path_ok() in a bit different forms. Namely, connect.c::parse_connect_url() gives URL directory git://site/nouser.git /nouser.git git://site/~nouser.git ~nouser.git by special casing "~user" syntax (in other words, a pathname that begins with a tilde _is_ special cased, and tilde is not considered a normal character that can be in a pathname). Note the lack of leading slash in the second one. Because that is how the shipped clients work, the daemon needs a bit of adjustment, because interpolation and base-path prefix codepaths wants to accept only paths that begin with a slash, and relies on the slash when interpolating it in the template or concatenating it to the base (e.g. roughly "sprintf(buf, "%s%s", base_path, dir)"). I came up with the following as a less invasive patch. There no longer is the ase where "user-path not allowed" error is given, as treating tilde as just a normal pathname character even when it appears at the beginning is the whole point of this change. As I said already, I am not sure if this is a good change, though. daemon.c | 9 + t/t5570-git-daemon.sh | 11 +++ 2 files changed, 16 insertions(+), 4 deletions(-) diff --git a/daemon.c b/daemon.c index afce1b9b7f..e560a535af 100644 --- a/daemon.c +++ b/daemon.c @@ -160,6 +160,7 @@ static const char *path_ok(const char *directory, struct hostinfo *hi) { static char rpath[PATH_MAX]; static char interp_path[PATH_MAX]; + char tilde_path[PATH_MAX]; const char *path; const char *dir; @@ -170,11 +171,11 @@ static const char *path_ok(const char *directory, struct hostinfo *hi) return NULL; } + if (!user_path && *dir == '~') { + snprintf(tilde_path, PATH_MAX, "/%s", dir); + dir = tilde_path; + } if (*dir == '~') { - if (!user_path) { - logerror("'%s': User-path not allowed", dir); - return NULL; - } if (*user_path) { /* Got either "~alice" or "~alice/foo"; * rewrite them to "~alice/%s" or diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh index 225a022e8a..b6d2b9a001 100755 --- a/t/t5570-git-daemon.sh +++ b/t/t5570-git-daemon.sh @@ -65,6 +65,17 @@ test_expect_success 'remote detects correct HEAD' ' ) ' +test_expect_success 'tilde is a normal character without --user-path' ' + nouser="~nouser.git" && + nouser_repo="$GIT_DAEMON_DOCUMENT_ROOT_PATH/$nouser" && + mkdir "$nouser_repo" && + git -C "$nouser_repo" --bare init && + >"$nouser_repo/git-daemon-export-ok" && + git push "$nouser_repo" master:master && + + git ls-remote "$GIT_DAEMON_URL/$nouser" +' + test_expect_success 'prepare pack objects' ' cp -R "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo.git "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_pack.git && (cd "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_pack.git &&
Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths
Duy Nguyen writes: > The amount of changes is unbelievable for fixing such a rare case > though. I wonder if we can just detect this in daemon.c and pass > "./~foo/bar" instead of "~foo/bar" to enter_repo() in non-strict mode > to "disable" expand_user_path(). If it works, it's much simpler > changes (even though a bit hacky) Conceptually, it ought to be updating the code that says "Does it begin with a tilde? Then try to do user-path expansion and fail if that is not enabled and if it succeeds use the resulting directory" to "Is user-path enabled and does it begin with a tilde? Then try to do user-path expansion and if it succeeds use the resulting directory". Compared to that mental model we have with this codepath, I agree that the change does look quite invasive and large. It is OK for a change to be large, as long as the end result is easier to read and understand than the original. I am undecided if that is the case with this patch, though. Also I am a bit worried about the change in the semantics. While it is not the best way to achieve this, the server operators could use it as a way to add directories whose contents need to be hidden to give them names that begin with a tilde without enabling user-path expansion. This change may be a new and useful feature for Luke, but to these server operators this change can be a new bug---it is probably a minor new bug as they can work it around by using other means to hide these directories, though.
Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths
On Wed, Oct 19, 2016 at 1:05 AM, Luke Shumaker wrote: >> I am not sure if it is even a bug. As you can easily lose that >> tilde that appears in front of subdirectory of /srv/git/ or replace >> it with something else (e.g. "u/"), this smells like "Don't do it if >> it hurts" thing to me. > > I buy into "Don't do it if it hurts", but that doesn't mean it's not a > bug on an uncommon edge-case. The amount of changes is unbelievable for fixing such a rare case though. I wonder if we can just detect this in daemon.c and pass "./~foo/bar" instead of "~foo/bar" to enter_repo() in non-strict mode to "disable" expand_user_path(). If it works, it's much simpler changes (even though a bit hacky) -- Duy
Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths
On Tue, 18 Oct 2016 13:08:45 -0400, Junio C Hamano wrote: > > Luke Shumaker writes: > > > The superficial aspect of this change is that git-daemon now allows paths > > that start with a "~". Previously, if git-daemon was run with > > "--base-path=/srv/git", it was impossible to get it to serve > > "/srv/git/~foo/bar.git". > > I am not sure I understand what you are saying here. Do you mean > > I have a path on my server /srv/git/~foo/bar.git; the tilde does > not mean anything special--it is just a byte in a valid pathname. > > I want to allow my users to say > > git fetch git://my.server/~foo/bar.git > > and fetch from that repository, but "git daemon" lacks the way > to configure to allow it. Yes, that is what I am saying. > If that is the case, what happens instead? Due to the leading > "~foo/" getting noticed as an attempt to use the user-path expansion > it is not treated as just a literal character? What happens instead is if (*dir == '~') { if (!user_path) { logerror("'%s': User-path not allowed", dir); return NULL; } which to the user looks like git clone git://my.server/~foo/bar.git Cloning into 'bar'... fatal: remote error: access denied or repository not exported: ~foo/bar.git > I am not sure if it is even a bug. As you can easily lose that > tilde that appears in front of subdirectory of /srv/git/ or replace > it with something else (e.g. "u/"), this smells like "Don't do it if > it hurts" thing to me. I buy into "Don't do it if it hurts", but that doesn't mean it's not a bug on an uncommon edge-case. Note that it doesn't hurt with git-shell or cgit (I haven't checked with gitweb). Many programs (especially shell scripts) fail to deal with filenames containing a space. "Don't put spaces in filenames if it hurts". It's still a bug in the program. Similarly, `git gui` used to not be able to add a file in a directory starting with '~' (when one clicked the file named "~foo/bar", it said something along the lines of "/home/~foo/bar is outside repository"), and one had to use `git add '~foo/bar` directly. "Don't do it if it hurts"; it was still a bug. Aside: one (somewhat silly) non-user reason that I've seen for a directory to start with '~' is that it sorts after all other ASCII characters; it moves the directory to the end of any lists. -- Happy hacking, ~ Luke Shumaker
Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths
Luke Shumaker writes: > The superficial aspect of this change is that git-daemon now allows paths > that start with a "~". Previously, if git-daemon was run with > "--base-path=/srv/git", it was impossible to get it to serve > "/srv/git/~foo/bar.git". I am not sure I understand what you are saying here. Do you mean I have a path on my server /srv/git/~foo/bar.git; the tilde does not mean anything special--it is just a byte in a valid pathname. I want to allow my users to say git fetch git://my.server/~foo/bar.git and fetch from that repository, but "git daemon" lacks the way to configure to allow it. If that is the case, what happens instead? Due to the leading "~foo/" getting noticed as an attempt to use the user-path expansion it is not treated as just a literal character? I am not sure if it is even a bug. As you can easily lose that tilde that appears in front of subdirectory of /srv/git/ or replace it with something else (e.g. "u/"), this smells like "Don't do it if it hurts" thing to me.
[PATCH] daemon, path.c: fix a bug with ~ in repo paths
The superficial aspect of this change is that git-daemon now allows paths that start with a "~". Previously, if git-daemon was run with "--base-path=/srv/git", it was impossible to get it to serve "/srv/git/~foo/bar.git". An odd edge-case that was broken. But from a source-code standpoint, the change is in path.c:enter_repo(). I have adjusted it to take separate "strict_prefix" and "strict_suffix" arguments, rather than a single "strict" argument. I also make it clearer what the purpose of each path buffer is for, by renaming them to chdir_path and ret_path; chdir_path is the path that we pass to chdir(); return_path is the path we return to the user. Using this nomenclature, we can more easily explain the behavior of the function. There are 3 DWIM measures that enter_repo() provides: tilde expansion, suffix guessing, and gitfile expansion; it also trims trailing slashes. Here is how they are applied to each path: +--+++ | Before this commit | chdir_path | ret_path | +--+++ | trim trailing slashes| !strict| !strict| | tilde expansion | !strict| false | | suffix guessing | !strict| !strict| | gitfile expansion (< 2.6.3) | !strict| false | | gitfile expansion (>= 2.6.3) | true | strict | +--+++ | With this commit | chdir_path | ret_path | +--+++ | trim trailing slashes| true | true | | tilde expansion | !strict_prefix | false | | suffix guessing | !strict_suffix | !strict_suffix | | gitfile expansion| true | false | +--+++ The separation of "strict" into "strict_prefix" and "strict_suffix" is necessary for git-daemon because it has separate --strict-paths (affects prefix and suffix) and --user-path (just prefix) flags that can be toggled separately. In the other programs where enter_repo() is called, I continued the existing behavior of tying the prefix and suffix strictness together together; though I am not entirely sure that they should all be enabling tilde expansion. But for now, their behavior hasn't changed. Signed-off-by: Luke Shumaker --- builtin/receive-pack.c | 2 +- builtin/upload-archive.c | 2 +- cache.h | 2 +- daemon.c | 42 +++ http-backend.c | 2 +- path.c | 135 +-- upload-pack.c| 2 +- 7 files changed, 96 insertions(+), 91 deletions(-) diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c index 011db00..f430e96 100644 --- a/builtin/receive-pack.c +++ b/builtin/receive-pack.c @@ -1860,7 +1860,7 @@ int cmd_receive_pack(int argc, const char **argv, const char *prefix) setup_path(); - if (!enter_repo(service_dir, 0)) + if (!enter_repo(service_dir, 0, 0)) die("'%s' does not appear to be a git repository", service_dir); git_config(receive_pack_config, NULL); diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c index 2caedf1..00d4ced 100644 --- a/builtin/upload-archive.c +++ b/builtin/upload-archive.c @@ -25,7 +25,7 @@ int cmd_upload_archive_writer(int argc, const char **argv, const char *prefix) if (argc != 2) usage(upload_archive_usage); - if (!enter_repo(argv[1], 0)) + if (!enter_repo(argv[1], 0, 0)) die("'%s' does not appear to be a git repository", argv[1]); /* put received options in sent_argv[] */ diff --git a/cache.h b/cache.h index 4cba08e..6380be0 100644 --- a/cache.h +++ b/cache.h @@ -1024,7 +1024,7 @@ enum scld_error safe_create_leading_directories_const(const char *path); int mkdir_in_gitdir(const char *path); extern char *expand_user_path(const char *path); -const char *enter_repo(const char *path, int strict); +const char *enter_repo(const char *path, int strict_prefix, int strict_suffix); static inline int is_absolute_path(const char *path) { return is_dir_sep(path[0]) || has_dos_drive_prefix(path); diff --git a/daemon.c b/daemon.c index 425aad0..118d337 100644 --- a/daemon.c +++ b/daemon.c @@ -170,27 +170,23 @@ static const char *path_ok(const char *directory, struct hostinfo *hi) return NULL; } - if (*dir == '~') { - if (!user_path) { - logerror("'%s': User-path not allowed", dir); - return NULL; - } - if (*user_path) { - /* Got eit