Hi,

Brandon Williams wrote:

> Commit 0383bbb901 (submodule-config: verify submodule names as paths,
> 2018-04-30) introduced some checks to ensure that submodule names don't
> include directory traversal components (e.g. "../").
>
> This addresses the vulnerability identified in 0383bbb901 but the root
> cause is that we use submodule names to construct paths to the
> submodule's git directory.  What we really should do is munge the
> submodule name before using it to construct a path.
>
> Introduce a function "strbuf_submodule_gitdir()" which callers can use
> to build a path to a submodule's gitdir.  This allows for a single
> location where we can munge the submodule name (by url encoding it)
> before using it as part of a path.
>
> Signed-off-by: Brandon Williams <bmw...@google.com>
> ---
> Using submodule names as is continues to be not such a good idea.  Maybe
> we could apply something like this to stop using them as is.  url
> encoding seems like the easiest approach, but I've also heard
> suggestions that we could use the SHA1 of the submodule name.
>
> Any thoughts?

I like this idea.  It avoids the security and complexity problems of
funny nested directories, while still making the submodule git dirs
easy to find.

The current behavior has been a particular problem in practice when
submodule names are nested:

        [submodule "a"]
                url = https://www.example.com/a
                path = a/1

        [submodule "a/b"]
                url = https://www.example.com/a/b
                path = a/2

We don't enforce any constraint on submodule names to prevent that,
but it causes hard-to-diagnose errors at clone time:

        fatal: not a git repository: superproject/a/1/../../.git/modules/a
        Unable to fetch in submodule path 'a/1'
        fatal: not a git repository: superproject/a/1/../../.git/modules/a
        fatal: not a git repository: superproject/a/1/../../.git/modules/a
        fatal: not a git repository: superproject/a/1/../../.git/modules/a
        Fetched in submodule 'a/1', but it did not contain 55ca6286e3e4f4fba5d0448333fa99fc5a404a73. Direct fetching of that commit failed.

because the fetch in .git/modules/a is interfered with by
.git/modules/a/b.
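
To make the contrast concrete, here is a minimal standalone sketch (not
the patch's code) of what encoding buys us.  encode_name() is a
hypothetical helper that I'm assuming behaves roughly like
strbuf_addstr_urlencode(..., 1): it percent-escapes every byte outside
the RFC 3986 unreserved set, using lowercase hex to match the "%2f" in
the updated test expectation.

```c
#include <stddef.h>

/* RFC 3986 "unreserved" characters: ASCII letters, digits, '-', '.', '_', '~'. */
static int is_unreserved(unsigned char c)
{
	return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
	       (c >= '0' && c <= '9') ||
	       c == '-' || c == '.' || c == '_' || c == '~';
}

/*
 * Hypothetical helper, for illustration only: percent-encode every byte
 * of "name" that is not unreserved, writing a NUL-terminated result to
 * "out".  Returns 0 on success, -1 if "out" (of size outlen) is too small.
 */
static int encode_name(const char *name, char *out, size_t outlen)
{
	static const char hex[] = "0123456789abcdef";
	size_t pos = 0;

	if (!outlen)
		return -1;

	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		if (is_unreserved(c)) {
			/* one byte plus room for the trailing NUL */
			if (pos + 1 >= outlen)
				return -1;
			out[pos++] = (char)c;
		} else {
			/* "%xx" plus room for the trailing NUL */
			if (pos + 3 >= outlen)
				return -1;
			out[pos++] = '%';
			out[pos++] = hex[c >> 4];
			out[pos++] = hex[c & 0xf];
		}
	}
	out[pos] = '\0';
	return 0;
}
```

With this, "a" stays "a" and "a/b" becomes "a%2fb", so the two gitdirs
end up as siblings under .git/modules/ instead of one nested inside the
other.  Note that a bare ".." passes through this sketch unchanged, so
encoding alone is not a substitute for validating the name.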

[...]
> --- a/submodule.c
> +++ b/submodule.c
[...]
> @@ -1933,9 +1938,29 @@ int submodule_to_gitdir(struct strbuf *buf, const char *submodule)
>                       goto cleanup;
>               }
>               strbuf_reset(buf);
> -             strbuf_git_path(buf, "%s/%s", "modules", sub->name);
> +             strbuf_submodule_gitdir(buf, the_repository, sub->name);
>       }
>  
>  cleanup:
>       return ret;
>  }
> +
> +void strbuf_submodule_gitdir(struct strbuf *buf, struct repository *r,
> +                          const char *submodule_name)
> +{
> +     int modules_len;

nit: size_t

> +
> +     strbuf_git_common_path(buf, r, "modules/");
> +     modules_len = buf->len;
> +     strbuf_addstr(buf, submodule_name);
> +
> +     /*
> +      * If the submodule gitdir already exists using the old location then
> +      * return that.
> +      */

nit: "old-fashioned location" or something.  Maybe the function could
use an API comment describing what's going on (that there are two
naming conventions and we try first the old, then the new).

Should we validate submodule_name here before accessing the gitdir
under the old convention?
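
For example, the legacy lookup could refuse names containing a ".."
component before touching the filesystem.  A rough sketch, assuming a
hypothetical helper has_dotdot_component() (not an existing git
function); treating both '/' and '\' as separators is my own assumption
for cross-platform safety:

```c
/*
 * Hypothetical helper, for illustration only: return 1 if any
 * '/'- or '\'-separated component of "name" is exactly "..",
 * and 0 otherwise.
 */
static int has_dotdot_component(const char *name)
{
	const char *p = name;

	while (*p) {
		const char *end = p;

		/* find the end of the current component */
		while (*end && *end != '/' && *end != '\\')
			end++;

		if (end - p == 2 && p[0] == '.' && p[1] == '.')
			return 1;

		/* skip the separator, if there is one */
		p = *end ? end + 1 : end;
	}
	return 0;
}
```

A caller would skip the old-convention access() check (or die) when this
returns 1, and fall through to the encoded path otherwise.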

> +     if (!access(buf->buf, F_OK))
> +             return;
> +
> +     strbuf_setlen(buf, modules_len);
> +     strbuf_addstr_urlencode(buf, submodule_name, 1);
> +}
[...]
> --- a/t/t7400-submodule-basic.sh
> +++ b/t/t7400-submodule-basic.sh
> @@ -932,7 +932,7 @@ test_expect_success 'recursive relative submodules stay relative' '
>               cd clone2 &&
>               git submodule update --init --recursive &&
>               echo "gitdir: ../.git/modules/sub3" >./sub3/.git_expect &&
> -             echo "gitdir: ../../../.git/modules/sub3/modules/dirdir/subsub" >./sub3/dirdir/subsub/.git_expect
> +             echo "gitdir: ../../../.git/modules/sub3/modules/dirdir%2fsubsub" >./sub3/dirdir/subsub/.git_expect
>       ) &&
>       test_cmp clone2/sub3/.git_expect clone2/sub3/.git &&
>       test_cmp clone2/sub3/dirdir/subsub/.git_expect clone2/sub3/dirdir/subsub/.git

Sensible.

Can there be a test of the compatibility code as well?  (I mean a test
that manually sets up a submodule in .git/modules/dirdir/subsub and
ensures that it gets reused.)

I'll apply this, experiment with it, and report back.  Thanks for
writing it.

Sincerely,
Jonathan
