Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths

2016-10-21 Thread Jeff King
On Fri, Oct 21, 2016 at 11:23:53AM -0700, Junio C Hamano wrote:

> A request to "git:///", depending on , results
> in "directory" given to path_ok() in a bit different forms.  Namely,
> connect.c::parse_connect_url() gives
> 
> URL   directory
> git://site/nouser.git /nouser.git
> git://site/~nouser.git~nouser.git
> 
> by special casing "~user" syntax (in other words, a pathname that
> begins with a tilde _is_ special cased, and tilde is not considered
> a normal character that can be in a pathname).  Note the lack of
> leading slash in the second one.
> 
> Because that is how the shipped clients work, the daemon needs a bit
> of adjustment, because interpolation and base-path prefix codepaths
> wants to accept only paths that begin with a slash, and relies on
> the slash when interpolating it in the template or concatenating it
> to the base (e.g. roughly "sprintf(buf, "%s%s", base_path, dir)").
> 
> I came up with the following as a less invasive patch.  There no
> longer is the ase where "user-path not allowed" error is given,
> as treating tilde as just a normal pathname character even when it
> appears at the beginning is the whole point of this change.

Thanks for explaining this. It is rather gross, but I think your
less-invasive patch is the best we could do given the client behavior.
And it's more what I would have expected based on the original problem
description.

> As I said already, I am not sure if this is a good change, though.

I also am on the fence. I'm not sure I understand a compelling reason to
have a non-interpolated "~" in the repo path name. Sure, it's a
limitation, but why does anybody care?

> @@ -170,11 +171,11 @@ static const char *path_ok(const char *directory, 
> struct hostinfo *hi)
>   return NULL;
>   }
>  
> + if (!user_path && *dir == '~') {
> + snprintf(tilde_path, PATH_MAX, "/%s", dir);
> + dir = tilde_path;
> + }

I know you are following existing convention in this function to use an
unchecked snprintf(), but it makes me wonder what kind of mischief a
client could get up to by silently truncating via snprintf. This
function is, after all, supposed to be checking the quality of the
incoming path.

xsnprintf() is probably too blunt a hammer, but:

  if (snprintf(rpath, PATH_MAX, ...) >= PATH_MAX) {
logerror("path too long");
return NULL;
  }

in various places may be appropriate.

-Peff


Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths

2016-10-21 Thread Junio C Hamano
Junio C Hamano  writes:

> Duy Nguyen  writes:
>
>> The amount of changes is unbelievable for fixing such a rare case
>> though. I wonder if we can just detect this in daemon.c and pass
>> "./~foo/bar" instead of "~foo/bar" to enter_repo() in non-strict mode
>> to "disable" expand_user_path(). If it works, it's much simpler
>> changes (even though a bit hacky)
>
> Conceptually, it ought to be updating the code that says "Does it
> begin with a tilde?  Then try to do user-path expansion and fail if
> that is not enabled and if it succeeds use the resulting directory"
> to "Is user-path enabled and does it begin with a tilde?  Then try
> to do user-path expansion and if it succeeds use the resulting
> directory".  Compared to that mental model we have with this
> codepath, I agree that the change does look quite invasive and
> large.
>
> It is OK for a change to be large, as long as the end result is
> easier to read and understand than the original.  I am undecided if
> that is the case with this patch, though.

I am still not convinced that this is a bugfix (as opposed to "add a
new feature to please Luke while regressing it for others"), but the
"this looks too invasive" made me look at the codepath involved.

There is this code in daemon.c::path_ok() that takes "dir" and ends
up calling enter_repo().

if (*dir == '~') {
if (!user_path) {
logerror("'%s': User-path not allowed", dir);
return NULL;
}

At first glance, it ought to be a single-liner

-   if (*dir == '~') {
+   if (user_path && *dir == '~') {

to make 'git://site/~foo.git" go to "/var/scm/~foo.git" when base
path is set to /var/scm/.  

Unfortunately that is not sufficient.

A request to "git:///", depending on , results
in "directory" given to path_ok() in a bit different forms.  Namely,
connect.c::parse_connect_url() gives

URL directory
git://site/nouser.git   /nouser.git
git://site/~nouser.git  ~nouser.git

by special casing "~user" syntax (in other words, a pathname that
begins with a tilde _is_ special cased, and tilde is not considered
a normal character that can be in a pathname).  Note the lack of
leading slash in the second one.

Because that is how the shipped clients work, the daemon needs a bit
of adjustment, because interpolation and base-path prefix codepaths
wants to accept only paths that begin with a slash, and relies on
the slash when interpolating it in the template or concatenating it
to the base (e.g. roughly "sprintf(buf, "%s%s", base_path, dir)").

I came up with the following as a less invasive patch.  There no
longer is the ase where "user-path not allowed" error is given,
as treating tilde as just a normal pathname character even when it
appears at the beginning is the whole point of this change.

As I said already, I am not sure if this is a good change, though.
 daemon.c  |  9 +
 t/t5570-git-daemon.sh | 11 +++
 2 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/daemon.c b/daemon.c
index afce1b9b7f..e560a535af 100644
--- a/daemon.c
+++ b/daemon.c
@@ -160,6 +160,7 @@ static const char *path_ok(const char *directory, struct 
hostinfo *hi)
 {
static char rpath[PATH_MAX];
static char interp_path[PATH_MAX];
+   char tilde_path[PATH_MAX];
const char *path;
const char *dir;
 
@@ -170,11 +171,11 @@ static const char *path_ok(const char *directory, struct 
hostinfo *hi)
return NULL;
}
 
+   if (!user_path && *dir == '~') {
+   snprintf(tilde_path, PATH_MAX, "/%s", dir);
+   dir = tilde_path;
+   }
if (*dir == '~') {
-   if (!user_path) {
-   logerror("'%s': User-path not allowed", dir);
-   return NULL;
-   }
if (*user_path) {
/* Got either "~alice" or "~alice/foo";
 * rewrite them to "~alice/%s" or
diff --git a/t/t5570-git-daemon.sh b/t/t5570-git-daemon.sh
index 225a022e8a..b6d2b9a001 100755
--- a/t/t5570-git-daemon.sh
+++ b/t/t5570-git-daemon.sh
@@ -65,6 +65,17 @@ test_expect_success 'remote detects correct HEAD' '
)
 '
 
+test_expect_success 'tilde is a normal character without --user-path' '
+   nouser="~nouser.git" &&
+   nouser_repo="$GIT_DAEMON_DOCUMENT_ROOT_PATH/$nouser" &&
+   mkdir "$nouser_repo" &&
+   git -C "$nouser_repo" --bare init &&
+   >"$nouser_repo/git-daemon-export-ok" &&
+   git push "$nouser_repo" master:master &&
+
+   git ls-remote "$GIT_DAEMON_URL/$nouser"
+'
+
 test_expect_success 'prepare pack objects' '
cp -R "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo.git 
"$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_pack.git &&
(cd "$GIT_DAEMON_DOCUMENT_ROOT_PATH"/repo_pack.git &&


Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths

2016-10-21 Thread Junio C Hamano
Duy Nguyen  writes:

> The amount of changes is unbelievable for fixing such a rare case
> though. I wonder if we can just detect this in daemon.c and pass
> "./~foo/bar" instead of "~foo/bar" to enter_repo() in non-strict mode
> to "disable" expand_user_path(). If it works, it's much simpler
> changes (even though a bit hacky)

Conceptually, it ought to be updating the code that says "Does it
begin with a tilde?  Then try to do user-path expansion and fail if
that is not enabled and if it succeeds use the resulting directory"
to "Is user-path enabled and does it begin with a tilde?  Then try
to do user-path expansion and if it succeeds use the resulting
directory".  Compared to that mental model we have with this
codepath, I agree that the change does look quite invasive and
large.

It is OK for a change to be large, as long as the end result is
easier to read and understand than the original.  I am undecided if
that is the case with this patch, though.

Also I am a bit worried about the change in the semantics.  While it
is not the best way to achieve this, the server operators could use
it as a way to add directories whose contents need to be hidden to
give them names that begin with a tilde without enabling user-path
expansion.  This change may be a new and useful feature for Luke,
but to these server operators this change can be a new bug---it is
probably a minor new bug as they can work it around by using other
means to hide these directories, though.


Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths

2016-10-19 Thread Duy Nguyen
On Wed, Oct 19, 2016 at 1:05 AM, Luke Shumaker  wrote:
>> I am not sure if it is even a bug.  As you can easily lose that
>> tilde that appears in front of subdirectory of /srv/git/ or replace
>> it with something else (e.g. "u/"), this smells like "Don't do it if
>> it hurts" thing to me.
>
> I buy into "Don't do it if it hurts", but that doesn't mean it's not a
> bug on an uncommon edge-case.

The amount of changes is unbelievable for fixing such a rare case
though. I wonder if we can just detect this in daemon.c and pass
"./~foo/bar" instead of "~foo/bar" to enter_repo() in non-strict mode
to "disable" expand_user_path(). If it works, it's much simpler
changes (even though a bit hacky)
-- 
Duy


Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths

2016-10-18 Thread Luke Shumaker
On Tue, 18 Oct 2016 13:08:45 -0400,
Junio C Hamano wrote:
> 
> Luke Shumaker  writes:
> 
> > The superficial aspect of this change is that git-daemon now allows paths
> > that start with a "~".  Previously, if git-daemon was run with
> > "--base-path=/srv/git", it was impossible to get it to serve
> > "/srv/git/~foo/bar.git".
> 
> I am not sure I understand what you are saying here.  Do you mean
> 
> I have a path on my server /srv/git/~foo/bar.git; the tilde does
> not mean anything special--it is just a byte in a valid pathname.
> 
> I want to allow my users to say
> 
>   git fetch git://my.server/~foo/bar.git
> 
> and fetch from that repository, but "git daemon" lacks the way
> to configure to allow it.

Yes, that is what I am saying.

> If that is the case, what happens instead?  Due to the leading
> "~foo/" getting noticed as an attempt to use the user-path expansion
> it is not treated as just a literal character?

What happens instead is

if (*dir == '~') {
if (!user_path) {
logerror("'%s': User-path not allowed", dir);
return NULL;
}

which to the user looks like

git clone git://my.server/~foo/bar.git
Cloning into 'bar'...
fatal: remote error: access denied or repository not exported: 
~foo/bar.git

> I am not sure if it is even a bug.  As you can easily lose that
> tilde that appears in front of subdirectory of /srv/git/ or replace
> it with something else (e.g. "u/"), this smells like "Don't do it if
> it hurts" thing to me.

I buy into "Don't do it if it hurts", but that doesn't mean it's not a
bug on an uncommon edge-case.  Note that it doesn't hurt with
git-shell or cgit (I haven't checked with gitweb).

Many programs (especially shell scripts) fail to deal with filenames
containing a space.  "Don't put spaces in filenames if it hurts".
It's still a bug in the program.

Similarly, `git gui` used to not be able to add a file in a directory
starting with '~' (when one clicked the file named "~foo/bar", it
said something along the lines of "/home/~foo/bar is outside
repository"), and one had to use `git add '~foo/bar` directly.
"Don't do it if it hurts"; it was still a bug.

  Aside: one (somewhat silly) non-user reason that I've seen for a
  directory to start with '~' is that it sorts after all other ASCII
  characters; it moves the directory to the end of any lists.

-- 
Happy hacking,
~ Luke Shumaker


Re: [PATCH] daemon, path.c: fix a bug with ~ in repo paths

2016-10-18 Thread Junio C Hamano
Luke Shumaker  writes:

> The superficial aspect of this change is that git-daemon now allows paths
> that start with a "~".  Previously, if git-daemon was run with
> "--base-path=/srv/git", it was impossible to get it to serve
> "/srv/git/~foo/bar.git".

I am not sure I understand what you are saying here.  Do you mean

I have a path on my server /srv/git/~foo/bar.git; the tilde does
not mean anything special--it is just a byte in a valid pathname.

I want to allow my users to say

git fetch git://my.server/~foo/bar.git

and fetch from that repository, but "git daemon" lacks the way
to configure to allow it.

If that is the case, what happens instead?  Due to the leading
"~foo/" getting noticed as an attempt to use the user-path expansion
it is not treated as just a literal character?

I am not sure if it is even a bug.  As you can easily lose that
tilde that appears in front of subdirectory of /srv/git/ or replace
it with something else (e.g. "u/"), this smells like "Don't do it if
it hurts" thing to me.



[PATCH] daemon, path.c: fix a bug with ~ in repo paths

2016-10-18 Thread Luke Shumaker
The superficial aspect of this change is that git-daemon now allows paths
that start with a "~".  Previously, if git-daemon was run with
"--base-path=/srv/git", it was impossible to get it to serve
"/srv/git/~foo/bar.git".  An odd edge-case that was broken.

But from a source-code standpoint, the change is in path.c:enter_repo().  I
have adjusted it to take separate "strict_prefix" and "strict_suffix"
arguments, rather than a single "strict" argument.

I also make it clearer what the purpose of each path buffer is for, by
renaming them to chdir_path and ret_path; chdir_path is the path that we
pass to chdir(); return_path is the path we return to the user.  Using this
nomenclature, we can more easily explain the behavior of the function.
There are 3 DWIM measures that enter_repo() provides: tilde expansion,
suffix guessing, and gitfile expansion; it also trims trailing slashes.
Here is how they are applied to each path:

+--+++
| Before this commit   | chdir_path | ret_path   |
+--+++
| trim trailing slashes| !strict| !strict|
| tilde expansion  | !strict| false  |
| suffix guessing  | !strict| !strict|
| gitfile expansion (< 2.6.3)  | !strict| false  |
| gitfile expansion (>= 2.6.3) | true   | strict |
+--+++
| With this commit | chdir_path | ret_path   |
+--+++
| trim trailing slashes| true   | true   |
| tilde expansion  | !strict_prefix | false  |
| suffix guessing  | !strict_suffix | !strict_suffix |
| gitfile expansion| true   | false  |
+--+++

The separation of "strict" into "strict_prefix" and "strict_suffix" is
necessary for git-daemon because it has separate --strict-paths (affects
prefix and suffix) and --user-path (just prefix) flags that can be toggled
separately.

In the other programs where enter_repo() is called, I continued the
existing behavior of tying the prefix and suffix strictness together
together; though I am not entirely sure that they should all be enabling
tilde expansion.  But for now, their behavior hasn't changed.

Signed-off-by: Luke Shumaker 
---
 builtin/receive-pack.c   |   2 +-
 builtin/upload-archive.c |   2 +-
 cache.h  |   2 +-
 daemon.c |  42 +++
 http-backend.c   |   2 +-
 path.c   | 135 +--
 upload-pack.c|   2 +-
 7 files changed, 96 insertions(+), 91 deletions(-)

diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
index 011db00..f430e96 100644
--- a/builtin/receive-pack.c
+++ b/builtin/receive-pack.c
@@ -1860,7 +1860,7 @@ int cmd_receive_pack(int argc, const char **argv, const 
char *prefix)
 
setup_path();
 
-   if (!enter_repo(service_dir, 0))
+   if (!enter_repo(service_dir, 0, 0))
die("'%s' does not appear to be a git repository", service_dir);
 
git_config(receive_pack_config, NULL);
diff --git a/builtin/upload-archive.c b/builtin/upload-archive.c
index 2caedf1..00d4ced 100644
--- a/builtin/upload-archive.c
+++ b/builtin/upload-archive.c
@@ -25,7 +25,7 @@ int cmd_upload_archive_writer(int argc, const char **argv, 
const char *prefix)
if (argc != 2)
usage(upload_archive_usage);
 
-   if (!enter_repo(argv[1], 0))
+   if (!enter_repo(argv[1], 0, 0))
die("'%s' does not appear to be a git repository", argv[1]);
 
/* put received options in sent_argv[] */
diff --git a/cache.h b/cache.h
index 4cba08e..6380be0 100644
--- a/cache.h
+++ b/cache.h
@@ -1024,7 +1024,7 @@ enum scld_error 
safe_create_leading_directories_const(const char *path);
 
 int mkdir_in_gitdir(const char *path);
 extern char *expand_user_path(const char *path);
-const char *enter_repo(const char *path, int strict);
+const char *enter_repo(const char *path, int strict_prefix, int strict_suffix);
 static inline int is_absolute_path(const char *path)
 {
return is_dir_sep(path[0]) || has_dos_drive_prefix(path);
diff --git a/daemon.c b/daemon.c
index 425aad0..118d337 100644
--- a/daemon.c
+++ b/daemon.c
@@ -170,27 +170,23 @@ static const char *path_ok(const char *directory, struct 
hostinfo *hi)
return NULL;
}
 
-   if (*dir == '~') {
-   if (!user_path) {
-   logerror("'%s': User-path not allowed", dir);
-   return NULL;
-   }
-   if (*user_path) {
-