Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Brandon Williams
On 11/03, Jeff King wrote:
> On Thu, Nov 03, 2016 at 11:45:38AM -0700, Brandon Williams wrote:
> 
> > On 11/03, Jeff King wrote:
> > > 
> > > So this seems like a reasonable direction to me. It obviously needs
> > > documentation and tests. Arguably there should be a fallback "allow"
> > > value when a protocol is not mentioned in the config so that you could
> > > convert the default from "user" to "never" if you wanted your config to
> > > specify a pure whitelist.
> > 
> > Yes I agree there should probably be a fallback value of 'never' maybe?
> > What you currently have preserves the behavior of what git does
> > now, if we did instead have a fallback of 'never' it would break current
> > users who don't already use GIT_ALLOW_PROTOCOL (well only if they use
> > crazy protocols).  We could ease into it though and start with default
> > to allow and then transition to a true whitelist sometime after this
> > change has been made?
> 
> I don't see the value in moving the out-of-the-box install to any
> default except "user". Right now the experience of using a third-party
> helper is something like:
> 
>   cp git-remote-hg /somewhere/in/your/PATH
>   git clone hg::whatever
> 
> We restrict its use in submodules by default, which is unlikely to bite
> many people. But if we started falling back to "never" all the time,
> then that second command would break until you officially "approve"
> remote-hg in your config.
> 
> I was thinking of just something to let people decide to have that level
> of paranoia themselves (especially if they want to just set up a
> whole-system white list via the config without bothering with
> environment variables). Like:
> 
>   git config --system protocol.allow never
>   git config --system protocol.https.allow always
> 
> That behaves exactly like:
> 
>   export GIT_ALLOW_PROTOCOL=https
> 
> except it just works everywhere, without having to tweak the environment
> of every process.
> 

Ah ok, so essentially letting the user specify a default behaviour
themselves.

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Jeff King
On Thu, Nov 03, 2016 at 11:45:38AM -0700, Brandon Williams wrote:

> On 11/03, Jeff King wrote:
> > 
> > So this seems like a reasonable direction to me. It obviously needs
> > documentation and tests. Arguably there should be a fallback "allow"
> > value when a protocol is not mentioned in the config so that you could
> > convert the default from "user" to "never" if you wanted your config to
> > specify a pure whitelist.
> 
> Yes I agree there should probably be a fallback value of 'never' maybe?
> What you currently have preserves the behavior of what git does
> now, if we did instead have a fallback of 'never' it would break current
> users who don't already use GIT_ALLOW_PROTOCOL (well only if they use
> crazy protocols).  We could ease into it though and start with default
> to allow and then transition to a true whitelist sometime after this
> change has been made?

I don't see the value in moving the out-of-the-box install to any
default except "user". Right now the experience of using a third-party
helper is something like:

  cp git-remote-hg /somewhere/in/your/PATH
  git clone hg::whatever

We restrict its use in submodules by default, which is unlikely to bite
many people. But if we started falling back to "never" all the time,
then that second command would break until you officially "approve"
remote-hg in your config.

I was thinking of just something to let people decide to have that level
of paranoia themselves (especially if they want to just set up a
whole-system white list via the config without bothering with
environment variables). Like:

  git config --system protocol.allow never
  git config --system protocol.https.allow always

That behaves exactly like:

  export GIT_ALLOW_PROTOCOL=https

except it just works everywhere, without having to tweak the environment
of every process.

> > Do you have interest in picking this up and running with it?
> 
> Yep! Thanks for the help in shaping this.

Great, thanks. I'm happy to review or discuss further as necessary.

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Brandon Williams
On 11/03, Jeff King wrote:
> 
> So this seems like a reasonable direction to me. It obviously needs
> documentation and tests. Arguably there should be a fallback "allow"
> value when a protocol is not mentioned in the config so that you could
> convert the default from "user" to "never" if you wanted your config to
> specify a pure whitelist.

Yes I agree there should probably be a fallback value of 'never' maybe?
What you currently have preserves the behavior of what git does
now, if we did instead have a fallback of 'never' it would break current
users who don't already use GIT_ALLOW_PROTOCOL (well only if they use
crazy protocols).  We could ease into it though and start with default
to allow and then transition to a true whitelist sometime after this
change has been made?

> 
> Without that, I think we'd want to keep GIT_ALLOW_PROTOCOL for the truly
> paranoid (though we should keep it indefinitely either way for backwards
> compatibility).
> 
> Do you have interest in picking this up and running with it?

Yep! Thanks for the help in shaping this.

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Jeff King
On Thu, Nov 03, 2016 at 11:19:54AM -0700, Brandon Williams wrote:

> On 11/03, Jeff King wrote:
> > +
> > +   /* unknown; let them be used only directly by the user */
> > +   return PROTOCOL_ALLOW_USER_ONLY;
> > +}
> > +
> >  int is_transport_allowed(const char *type)
> >  {
> > -   const struct string_list *allowed = protocol_whitelist();
> > -   return !allowed || string_list_has_string(allowed, type);
> > +   const struct string_list *whitelist = protocol_whitelist();
> > +   if (whitelist)
> > +   return string_list_has_string(whitelist, type);
> > +
> > +   switch (get_protocol_config(type)) {
> > +   case PROTOCOL_ALLOW_ALWAYS:
> > +   return 1;
> > +   case PROTOCOL_ALLOW_NEVER:
> > +   return 0;
> > +   case PROTOCOL_ALLOW_USER_ONLY:
> > +   return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> > +   }
> 
> I know this is just a rough patch you wiped up but one question:
> With the 'user' state, how exactly do you envision this env variable
> working?  Do we want the user to have to explicitly set
> GIT_PROTOCOL_FROM_USER in their environment and then have these other
> commands (like git-submodule) explicitly clear the env var or would we
> rather these subcommands set a variable indicating they aren't coming
> from the user and the deafult state (no var set) is a user run command?

See the follow-up I just posted, but basically, the rules are:

  - if you don't say anything, then the URL is from the user

  - git-submodule would set it to "0" (i.e., tell us to be more careful)

  - tools like "go get" would similarly set it to "0" if they are
passing untrusted URLs

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Jeff King
On Thu, Nov 03, 2016 at 01:53:27PM -0400, Jeff King wrote:

> I'd design the new system from scratch, and have it kick in _only_ when
> GIT_ALLOW_PROTOCOL is not set. That lets existing callers continue to
> have the safe behavior until they are ready to move to the new format.
> 
> Something like the patch below (which is just for illustration, and not
> tested beyond compilation).

Here's that same patch with a few tweaks:

  - it changes git-submodule to use the new, more flexible system (which
also gets a it a lot more test coverage)

  - it tweaks two tests which use the "ext" helper to enable it (since
it's blacklisted by default; I have mixed feelings on that, but I
see why Blake wants it, as it would have protected things like "go
get" out of the box).

  - it adds "file://" as a known-good protocol, even for submodules,
which matches the current code.  I am not sure if this is reasonable
or not. A malicious repository probably can't do much by pointing
you to cloning your own repo as a submodule unless you then _also_
run some arbitrary code to expose it, at which point it's generally
game-over anyway.

And I'd expect automated services (like GitHub Pages) to already
have a cut-down whitelist via GIT_ALLOW_PROTOCOL (and I happen to
know that it goes).

So this seems like a reasonable direction to me. It obviously needs
documentation and tests. Arguably there should be a fallback "allow"
value when a protocol is not mentioned in the config so that you could
convert the default from "user" to "never" if you wanted your config to
specify a pure whitelist.

Without that, I think we'd want to keep GIT_ALLOW_PROTOCOL for the truly
paranoid (though we should keep it indefinitely either way for backwards
compatibility).

Do you have interest in picking this up and running with it?

-Peff

diff --git a/git-submodule.sh b/git-submodule.sh
index a024a135d..0a477b4c9 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -21,14 +21,10 @@ require_work_tree
 wt_prefix=$(git rev-parse --show-prefix)
 cd_to_toplevel
 
-# Restrict ourselves to a vanilla subset of protocols; the URLs
-# we get are under control of a remote repository, and we do not
-# want them kicking off arbitrary git-remote-* programs.
-#
-# If the user has already specified a set of allowed protocols,
-# we assume they know what they're doing and use that instead.
-: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
-export GIT_ALLOW_PROTOCOL
+# Tell the rest of git that any URLs we get don't come
+# directly from the user, so it can apply policy as appropriate.
+GIT_PROTOCOL_FROM_USER=0
+export GIT_PROTOCOL_FROM_USER
 
 command=
 branch=
diff --git a/t/t5509-fetch-push-namespaces.sh b/t/t5509-fetch-push-namespaces.sh
index bc44ac36d..75c570adc 100755
--- a/t/t5509-fetch-push-namespaces.sh
+++ b/t/t5509-fetch-push-namespaces.sh
@@ -4,6 +4,7 @@ test_description='fetch/push involving ref namespaces'
 . ./test-lib.sh
 
 test_expect_success setup '
+   git config --global protocol.ext.allow user &&
test_tick &&
git init original &&
(
diff --git a/t/t5802-connect-helper.sh b/t/t5802-connect-helper.sh
index b7a7f9d58..c6c266187 100755
--- a/t/t5802-connect-helper.sh
+++ b/t/t5802-connect-helper.sh
@@ -4,6 +4,7 @@ test_description='ext::cmd remote "connect" helper'
 . ./test-lib.sh
 
 test_expect_success setup '
+   git config --global protocol.ext.allow user &&
test_tick &&
git commit --allow-empty -m initial &&
test_tick &&
diff --git a/transport.c b/transport.c
index d57e8dec2..cd603fbf5 100644
--- a/transport.c
+++ b/transport.c
@@ -664,10 +664,70 @@ static const struct string_list *protocol_whitelist(void)
return enabled ? &allowed : NULL;
 }
 
+enum protocol_allow_config {
+   PROTOCOL_ALLOW_NEVER = 0,
+   PROTOCOL_ALLOW_USER_ONLY,
+   PROTOCOL_ALLOW_ALWAYS
+};
+
+static enum protocol_allow_config parse_protocol_config(const char *key,
+   const char *value)
+{
+   if (!strcasecmp(value, "always"))
+   return PROTOCOL_ALLOW_ALWAYS;
+   else if (!strcasecmp(value, "never"))
+   return PROTOCOL_ALLOW_NEVER;
+   else if (!strcasecmp(value, "user"))
+   return PROTOCOL_ALLOW_USER_ONLY;
+
+   /* XXX maybe also interpret git_config_bool() here? */
+   die("unknown value for config '%s': %s", key, value);
+}
+
+static enum protocol_allow_config get_protocol_config(const char *type)
+{
+   char *key = xstrfmt("protocol.%s.allow", type);
+   char *value;
+
+   if (!git_config_get_string(key, &value)) {
+   enum protocol_allow_config ret =
+   parse_protocol_config(key, value);
+   free(key);
+   free(value);
+   return ret;
+   }
+   free(key);
+
+   /* known safe */
+   if (!strcmp(type, "http") ||
+   !strcmp(type, "https") |

Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Brandon Williams
On 11/03, Jeff King wrote:
> +
> + /* unknown; let them be used only directly by the user */
> + return PROTOCOL_ALLOW_USER_ONLY;
> +}
> +
>  int is_transport_allowed(const char *type)
>  {
> - const struct string_list *allowed = protocol_whitelist();
> - return !allowed || string_list_has_string(allowed, type);
> + const struct string_list *whitelist = protocol_whitelist();
> + if (whitelist)
> + return string_list_has_string(whitelist, type);
> +
> + switch (get_protocol_config(type)) {
> + case PROTOCOL_ALLOW_ALWAYS:
> + return 1;
> + case PROTOCOL_ALLOW_NEVER:
> + return 0;
> + case PROTOCOL_ALLOW_USER_ONLY:
> + return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
> + }

I know this is just a rough patch you wiped up but one question:
With the 'user' state, how exactly do you envision this env variable
working?  Do we want the user to have to explicitly set
GIT_PROTOCOL_FROM_USER in their environment and then have these other
commands (like git-submodule) explicitly clear the env var or would we
rather these subcommands set a variable indicating they aren't coming
from the user and the deafult state (no var set) is a user run command?

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Brandon Williams
On 11/03, Jeff King wrote:
> On Thu, Nov 03, 2016 at 10:51:31AM -0700, Brandon Williams wrote:
> 
> > > > I don't know if I'm sold on a 'user' state just yet, perhaps that's just
> > > > because I view a whitelist or blacklist as well black and white and
> > > > having this user state adds in a gray area.
> > > 
> > > Well the "user" state is to differentiate between the
> > > * "I consciously typed `git clone ...` (and e.g. I know what happens as
> > >   I know the server admin and they are trustworthy.)
> > > * a repository contains a possible hostile .gitmodules file such
> > >   that I am not aware of the network connection.
> > 
> > This is still a gray area to me.  I think that if we have a whitelist of
> > protocols then it should be a true whitelist and not have some means of
> > going around it.  It just seems like something that could be exploited.
> 
> How do you implement:
> 
>   git clone --recursive trusted:foo.git
> 
> and use your ssh keys for the "trusted" server, but not for any servers
> mentioned in .gitmodules?
> 
> You need some way of distinguishing between the two contexts (and
> setting policy for each).
> 
> -Peff

Interesting.  Ok I can see how this would be a useful now.  Thanks for
the example :)

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Jeff King
On Thu, Nov 03, 2016 at 10:51:31AM -0700, Brandon Williams wrote:

> > > I don't know if I'm sold on a 'user' state just yet, perhaps that's just
> > > because I view a whitelist or blacklist as well black and white and
> > > having this user state adds in a gray area.
> > 
> > Well the "user" state is to differentiate between the
> > * "I consciously typed `git clone ...` (and e.g. I know what happens as
> >   I know the server admin and they are trustworthy.)
> > * a repository contains a possible hostile .gitmodules file such
> >   that I am not aware of the network connection.
> 
> This is still a gray area to me.  I think that if we have a whitelist of
> protocols then it should be a true whitelist and not have some means of
> going around it.  It just seems like something that could be exploited.

How do you implement:

  git clone --recursive trusted:foo.git

and use your ssh keys for the "trusted" server, but not for any servers
mentioned in .gitmodules?

You need some way of distinguishing between the two contexts (and
setting policy for each).

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Jeff King
On Thu, Nov 03, 2016 at 10:39:35AM -0700, Stefan Beller wrote:

> >>   protocol.X.allow = always | user | never
> >
> > It sounds like there is interest for this sort of behavior, it would
> > definitely require a larger change than what I initially proposed.  One
> > problem I see though is that with this we have support for both a
> > blacklist and a whitelist.  Which wins?
> 
> For the submodule operations we'll use a whitelist, because we want to
> provide security and for the other case we can offer a blacklist as a bandaid.
> 
> My opinion on blacklists is roughly aligned with e.g. :
> https://blog.codinghorror.com/blacklists-dont-work/
> http://blog.deepinstinct.com/2016/02/04/when-blacklists-dont-really-work/
> 
> So IMHO we could drop the "never" and substitute it with a "warn" or
> "ask-user", such that this configuration becomes a white list for both cases:
> 
>  protocol.X.allow = always | user | warn

I don't think blacklists work in the general case, because they grow out
of date and fail-open. But you want to have _some_ blacklisting
mechanism, in order override a decision of the whitelist.

For instance, the default submodule whitelist would probably include
https and ssh. But if I'm cloning potentially malicious repos and I
don't ever want them to trigger ssh (because I don't want them to use my
ssh keys, whereas I have explicitly set up my credentials such that http
is safe to use), I would want to be able to do:

  git config protocol.ssh.allow never

(or "git -c", or whatever).

True, a whitelist is safer. If we add a new "foo" protocol that also
looks at your ssh keys, you're screwed. And that's why I designed it as
a pure-whitelist in the first place. But it comes at the price of
convenience, because you have to manually add each new innocent protocol
to the whitelist.

> So you're suggesting that setting it to "never" doesn't have any effect
> except for cluttering the config file?
> I don't think we should do that; each setting should have an impact.
> So maybe the "never" would be there to disallow protocols of the hardcoded
> white list (e.g. http)

Exactly.

> > I don't know if I'm sold on a 'user' state just yet, perhaps that's just
> > because I view a whitelist or blacklist as well black and white and
> > having this user state adds in a gray area.
> 
> Well the "user" state is to differentiate between the
> * "I consciously typed `git clone ...` (and e.g. I know what happens as
>   I know the server admin and they are trustworthy.)
> * a repository contains a possible hostile .gitmodules file such
>   that I am not aware of the network connection.

Right. I had assumed that we would tell the difference between those
automatically (by seeing if we got the URL on the command line), but
things like "go get" show that the context is often before git is even
called.

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Jeff King
On Thu, Nov 03, 2016 at 10:25:15AM -0700, Brandon Williams wrote:

> > So I think this probably needs to be a separate parallel system where
> > each protocol can be white- or black-listed in a context-specific way.
> > Like:
> > 
> >   protocol.X.allow = always | user | never
> 
> It sounds like there is interest for this sort of behavior, it would
> definitely require a larger change than what I initially proposed.  One
> problem I see though is that with this we have support for both a
> blacklist and a whitelist.  Which wins?  Or do we simply generate a
> whitelist of allowed protocols which includes all protocols with allow
> set to 'always' and if it is set to 'never' then it just isn't included
> in the whitelist?

I think trying to combine the two or generate the whitelist from the
more flexible format is a recipe for madness.

I'd design the new system from scratch, and have it kick in _only_ when
GIT_ALLOW_PROTOCOL is not set. That lets existing callers continue to
have the safe behavior until they are ready to move to the new format.

Something like the patch below (which is just for illustration, and not
tested beyond compilation).

> I don't know if I'm sold on a 'user' state just yet, perhaps that's just
> because I view a whitelist or blacklist as well black and white and
> having this user state adds in a gray area.

The lack of understanding the "user" context is what makes the current
system so painful. The only way a caller can influence the system is to
hand over the policy directly: this is allowed, this is not. But systems
like "go get" should not be setting policy. They should be giving us a
hint about the context, and letting git implement the policy.

-- >8 --
diff --git a/transport.c b/transport.c
index d57e8dec2..0d5b382db 100644
--- a/transport.c
+++ b/transport.c
@@ -664,10 +664,69 @@ static const struct string_list *protocol_whitelist(void)
return enabled ? &allowed : NULL;
 }
 
+enum protocol_allow_config {
+   PROTOCOL_ALLOW_NEVER,
+   PROTOCOL_ALLOW_USER_ONLY,
+   PROTOCOL_ALLOW_ALWAYS
+};
+
+static enum protocol_allow_config parse_protocol_config(const char *key,
+   const char *value)
+{
+   if (!strcasecmp(value, "always"))
+   return PROTOCOL_ALLOW_ALWAYS;
+   else if (!strcasecmp(value, "never"))
+   return PROTOCOL_ALLOW_NEVER;
+   else if (!strcasecmp(value, "user"))
+   return PROTOCOL_ALLOW_USER_ONLY;
+
+   /* XXX maybe also interpret git_config_bool() here? */
+   die("unknown value for config '%s': %s", key, value);
+}
+
+static enum protocol_allow_config get_protocol_config(const char *type)
+{
+   char *key = xstrfmt("protocol.%s.allow", type);
+   char *value;
+
+   if (!git_config_get_string(key, &value)) {
+   enum protocol_allow_config ret =
+   parse_protocol_config(key, value);
+   free(key);
+   free(value);
+   return ret;
+   }
+   free(key);
+
+   /* known safe */
+   if (!strcmp(type, "http") ||
+   !strcmp(type, "https") ||
+   !strcmp(type, "git") ||
+   !strcmp(type, "ssh"))
+   return PROTOCOL_ALLOW_ALWAYS;
+
+   /* known scary; err on the side of caution */
+   if (!strcmp(type, "ext"))
+   return PROTOCOL_ALLOW_NEVER;
+
+   /* unknown; let them be used only directly by the user */
+   return PROTOCOL_ALLOW_USER_ONLY;
+}
+
 int is_transport_allowed(const char *type)
 {
-   const struct string_list *allowed = protocol_whitelist();
-   return !allowed || string_list_has_string(allowed, type);
+   const struct string_list *whitelist = protocol_whitelist();
+   if (whitelist)
+   return string_list_has_string(whitelist, type);
+
+   switch (get_protocol_config(type)) {
+   case PROTOCOL_ALLOW_ALWAYS:
+   return 1;
+   case PROTOCOL_ALLOW_NEVER:
+   return 0;
+   case PROTOCOL_ALLOW_USER_ONLY:
+   return git_env_bool("GIT_PROTOCOL_FROM_USER", 1);
+   }
 }
 
 void transport_check_allowed(const char *type)


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Brandon Williams
On 11/03, Stefan Beller wrote:
> >>   protocol.X.allow = always | user | never
> >
> > It sounds like there is interest for this sort of behavior, it would
> > definitely require a larger change than what I initially proposed.  One
> > problem I see though is that with this we have support for both a
> > blacklist and a whitelist.  Which wins?
> 
> For the submodule operations we'll use a whitelist, because we want to
> provide security and for the other case we can offer a blacklist as a bandaid.
> 
> My opinion on blacklists is roughly aligned with e.g. :
> https://blog.codinghorror.com/blacklists-dont-work/
> http://blog.deepinstinct.com/2016/02/04/when-blacklists-dont-really-work/
> 
> So IMHO we could drop the "never" and substitute it with a "warn" or
> "ask-user", such that this configuration becomes a white list for both cases:
> 
>  protocol.X.allow = always | user | warn
> 
> > Or do we simply generate a
> > whitelist of allowed protocols which includes all protocols with allow
> > set to 'always' and if it is set to 'never' then it just isn't included
> > in the whitelist?
> 
> So you're suggesting that setting it to "never" doesn't have any effect
> except for cluttering the config file?
> I don't think we should do that; each setting should have an impact.
> So maybe the "never" would be there to disallow protocols of the hardcoded
> white list (e.g. http)

Thats what I meant, if a protocol is listed as 'never' then it just
removes that protocol from the whitelist.  That way we still have the
benefit of using a whitelist vs a blacklist.  Also, if we move in this
direction should we setup a default whitelist of allowed protocols?

> >
> > I don't know if I'm sold on a 'user' state just yet, perhaps that's just
> > because I view a whitelist or blacklist as well black and white and
> > having this user state adds in a gray area.
> 
> Well the "user" state is to differentiate between the
> * "I consciously typed `git clone ...` (and e.g. I know what happens as
>   I know the server admin and they are trustworthy.)
> * a repository contains a possible hostile .gitmodules file such
>   that I am not aware of the network connection.

This is still a gray area to me.  I think that if we have a whitelist of
protocols then it should be a true whitelist and not have some means of
going around it.  It just seems like something that could be exploited.

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Stefan Beller
>>   protocol.X.allow = always | user | never
>
> It sounds like there is interest for this sort of behavior, it would
> definitely require a larger change than what I initially proposed.  One
> problem I see though is that with this we have support for both a
> blacklist and a whitelist.  Which wins?

For the submodule operations we'll use a whitelist, because we want to
provide security and for the other case we can offer a blacklist as a bandaid.

My opinion on blacklists is roughly aligned with e.g. :
https://blog.codinghorror.com/blacklists-dont-work/
http://blog.deepinstinct.com/2016/02/04/when-blacklists-dont-really-work/

So IMHO we could drop the "never" and substitute it with a "warn" or
"ask-user", such that this configuration becomes a white list for both cases:

 protocol.X.allow = always | user | warn

> Or do we simply generate a
> whitelist of allowed protocols which includes all protocols with allow
> set to 'always' and if it is set to 'never' then it just isn't included
> in the whitelist?

So you're suggesting that setting it to "never" doesn't have any effect
except for cluttering the config file?
I don't think we should do that; each setting should have an impact.
So maybe the "never" would be there to disallow protocols of the hardcoded
white list (e.g. http)

>
> I don't know if I'm sold on a 'user' state just yet, perhaps that's just
> because I view a whitelist or blacklist as well black and white and
> having this user state adds in a gray area.

Well the "user" state is to differentiate between the
* "I consciously typed `git clone ...` (and e.g. I know what happens as
  I know the server admin and they are trustworthy.)
* a repository contains a possible hostile .gitmodules file such
  that I am not aware of the network connection.


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Brandon Williams
On 11/03, Jeff King wrote:
> On Wed, Nov 02, 2016 at 05:22:25PM -0700, Jonathan Nieder wrote:
> 
> > Another difficulty with setting GIT_ALLOW_PROTOCOL globally is that it
> > requires copy/pasting the default value from upstream and then adding
> > the values I want.  There's no straightforward way to get the current
> > value and add to it, in case I want to benefit from future upstream
> > fixes to the default list.
> 
> I agree that this is a big drawback of the current scheme, and it would
> be nice to be able to say "also allow".
> 
> > That is, would it be possible to use something like
> > 
> > [protocol "sso"]
> > allow = always
> > 
> > instead of
> > 
> > [core]
> > allowProtocol = file:git:http:https::sso
> > 
> > ?
> 
> One complication is that the whitelist has multiple states:
> 
>   1. if it's not used at all, anything goes
> 
>   2. if it exists and has zero or more entries, only those entries are
>  allowed
> 
> And then submodules are an exception to (1), because it's not anything
> goes. It's "this default safe whitelist".
> 
> So when does protocol.sso.allow kick in? We wouldn't want it to trigger
> case (2) for things like fetch (disabling other non-allowed protocols).
> Nor do I think we'd only want it for the submodule case, as I would
> assume that "protocol.sso.allow = false" should disable it.
> 
> So I think this probably needs to be a separate parallel system where
> each protocol can be white- or black-listed in a context-specific way.
> Like:
> 
>   protocol.X.allow = always | user | never

It sounds like there is interest for this sort of behavior, it would
definitely require a larger change than what I initially proposed.  One
problem I see though is that with this we have support for both a
blacklist and a whitelist.  Which wins?  Or do we simply generate a
whitelist of allowed protocols which includes all protocols with allow
set to 'always' and if it is set to 'never' then it just isn't included
in the whitelist?

I don't know if I'm sold on a 'user' state just yet, perhaps that's just
because I view a whitelist or blacklist as well black and white and
having this user state adds in a gray area.

> > An example approach would be a GIT_ALLOW_PROTOCOL var returned by
> > "git var".
> > 
> > That way git-submodule.sh could do
> > 
> > : ${GIT_ALLOW_PROTOCOL=$(git var GIT_ALLOW_PROTOCOL)}
> > 
> > and it would just work.  Other tools could do the same, with a
> > fallback to the current default until new enough git is in widespread
> > use.
> 
> ...some automated way to say "is this protocol supported"? I think it is
> not just "give me ALLOW_PROTOCOL" anymore, though, but "apply your rules
> to this protocol, and tell me if it is supported".

I agree, if we do add different states to a protocol then we couldn't
simply ask for a whitelist/blacklist of protocols anymore since its more
of a graylist :) (if such a thing exits). 

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-03 Thread Jeff King
On Wed, Nov 02, 2016 at 05:22:25PM -0700, Jonathan Nieder wrote:

> Another difficulty with setting GIT_ALLOW_PROTOCOL globally is that it
> requires copy/pasting the default value from upstream and then adding
> the values I want.  There's no straightforward way to get the current
> value and add to it, in case I want to benefit from future upstream
> fixes to the default list.

I agree that this is a big drawback of the current scheme, and it would
be nice to be able to say "also allow".

> That is, would it be possible to use something like
> 
>   [protocol "sso"]
>   allow = always
> 
> instead of
> 
>   [core]
>   allowProtocol = file:git:http:https::sso
> 
> ?

One complication is that the whitelist has multiple states:

  1. if it's not used at all, anything goes

  2. if it exists and has zero or more entries, only those entries are
 allowed

And then submodules are an exception to (1), because it's not anything
goes. It's "this default safe whitelist".

So when does protocol.sso.allow kick in? We wouldn't want it to trigger
case (2) for things like fetch (disabling other non-allowed protocols).
Nor do I think we'd only want it for the submodule case, as I would
assume that "protocol.sso.allow = false" should disable it.

So I think this probably needs to be a separate parallel system where
each protocol can be white- or black-listed in a context-specific way.
Like:

  protocol.X.allow = always | user | never

Where "user" specifies that the protocol is OK coming directly from the
user, but not from other sources.  And we default known-common-and-good
ones like protocol.http.allow to "always", unknown ones (like "foo"
which runs "remote-foo") to "user"), and possibly known-scary ones like
"ext") to "never".

Then we need some way of telling git "you are in a context where the URL
parameter is not coming from the user". Probably via the environment in
GIT_PROTOCOL_FROM_USER or similar. Which git-submodule would sent when
recursing clones, along with things like "go get".

In other words, stop asking git-submodule or "go get" to specify policy,
and let them specify context that can be used to implement policy that
the user specifies (and have git provide a sane default policy).

I think this would all take a backseat to GIT_ALLOW_PROTOCOL, for
backwards compatibility, and then GIT_ALLOW_PROTOCOL could slowly die
off over time.

> That reminds me: external tools also set GIT_ALLOW_PROTOCOL when the
> user hasn't set it explicitly, like git-submodule.sh does.  E.g.
> repo ,
> mercurial 
> .
> Other external tools consume GIT_ALLOW_PROTOCOL, like 'go get'
> .
> Can we make it more convenient for them to support this configuration
> too?

I think under my proposal above this ugliness just goes away, as all
they have to do is say "trust my URLs less; they come from an automated
source" without specifying policy themselves.

But we may still want...

> An example approach would be a GIT_ALLOW_PROTOCOL var returned by
> "git var".
> 
> That way git-submodule.sh could do
> 
>   : ${GIT_ALLOW_PROTOCOL=$(git var GIT_ALLOW_PROTOCOL)}
> 
> and it would just work.  Other tools could do the same, with a
> fallback to the current default until new enough git is in widespread
> use.

...some automated way to say "is this protocol supported"? I think it is
not just "give me ALLOW_PROTOCOL" anymore, though, but "apply your rules
to this protocol, and tell me if it is supported".

I don't think things like "go get" would need it, but you would if you
had a porcelain built around git that was accessing a URL _not_ via
git-fetch, but wanted to apply git's rules. That could come as a step 2
later, though.

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Junio C Hamano
Jonathan Nieder  writes:

> That is, would it be possible to use something like
>
>   [protocol "sso"]
>   allow = always
>
> instead of
>
>   [core]
>   allowProtocol = file:git:http:https::sso
>
> ?

That sounds like quite a large usability improvement to me.



Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Blake Burkhart
Thanks for CCing me.

I haven't looked at this implementation in detail, but it would be
good to move this configuration into the config system because I think
we can more easily provide a default safe configuration.

It would be nice to use this to introduce a default list of
whitelisted protocols that even applies to `git clone`. I strongly
think we need to find a way to have git-remote-ext disabled by
default. This could be a way to do it.

On Wed, Nov 2, 2016 at 7:22 PM, Jonathan Nieder  wrote:
> That reminds me: external tools also set GIT_ALLOW_PROTOCOL when the
> user hasn't set it explicitly, like git-submodule.sh does.  E.g.
> repo ,
> mercurial 
> .
> Other external tools consume GIT_ALLOW_PROTOCOL, like 'go get'
> .
> Can we make it more convenient for them to support this configuration
> too?

Most of these are my fault too. I encouraged git-repo and mercurial to
use GIT_ALLOW_PROTOCOL to avoid security issues from git-remote-ext.

-- 
Blake Burkhart


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Jonathan Nieder
(+peff and bburky, who introduced GIT_ALLOW_PROTOCOL)
Brandon Williams wrote:

> Add configuration option 'core.allowProtocol' to allow users to create a
> whitelist of allowed protocols for fetch/push/clone in their gitconfig.

Ooh.

This would be especially useful at $DAYJOB, where there is a custom
sso:// protocol that is often used by submodules.  Using an envvar to
whitelist it globally is painful because

 - it disables other protocols even when explicitly requested on a
   plain "git clone" command line by the user.  By comparison, the
   built-in git-submodule.sh whitelist only applies to submodules.

 - platform-specific instructions to set an environment variable can
   be more difficult than "just set this git configuration"

Another difficulty with setting GIT_ALLOW_PROTOCOL globally is that it
requires copy/pasting the default value from upstream and then adding
the values I want.  There's no straightforward way to get the current
value and add to it, in case I want to benefit from future upstream
fixes to the default list.

That is, would it be possible to use something like

[protocol "sso"]
allow = always

instead of

[core]
allowProtocol = file:git:http:https::sso

?

[...]
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -27,7 +27,8 @@ cd_to_toplevel
>  #
>  # If the user has already specified a set of allowed protocols,
>  # we assume they know what they're doing and use that instead.
> -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> +config_whitelist=$(git config core.allowProtocol)
> +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}

optional: To avoid config parsing when GIT_ALLOW_PROTOCOL is already
set, could do something like

 if ! test "${GIT_ALLOW_PROTOCOL+set}"
 then
GIT_ALLOW_PROTOCOL=$(
git config --name-only --get-regexp 'protocol\..*\.allow' 
always |
sed -e 's/^protocol.//' -e 's/.allow$//' |
tr '\n' ':'
)
GIT_ALLOW_PROTOCOL=${GIT_ALLOW_PROTOCOL%:}
: ${GIT_ALLOW_PROTOCOL:=file:git:http:https:ssh}
 fi

[...]
> --- a/transport.c
> +++ b/transport.c
> @@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
>  
>   if (enabled < 0) {
>   const char *v = getenv("GIT_ALLOW_PROTOCOL");
> - if (v) {
> + if (v || !git_config_get_value("core.allowProtocol", &v)) {
>   string_list_split(&allowed, v, ':', -1);

This has the effect of always disabling other protocols when
core.allowProtocol is set.  Is that intended?

Like the default list used by submodule, I'd be happiest if this only
applied to repositories cloned implicitly instead of those passed
directly to 'git clone'.

That reminds me: external tools also set GIT_ALLOW_PROTOCOL when the
user hasn't set it explicitly, like git-submodule.sh does.  E.g.
repo ,
mercurial 
.
Other external tools consume GIT_ALLOW_PROTOCOL, like 'go get'
.
Can we make it more convenient for them to support this configuration
too?

An example approach would be a GIT_ALLOW_PROTOCOL var returned by
"git var".

That way git-submodule.sh could do

: ${GIT_ALLOW_PROTOCOL=$(git var GIT_ALLOW_PROTOCOL)}

and it would just work.  Other tools could do the same, with a
fallback to the current default until new enough git is in widespread
use.

Thanks and hope that helps,
Jonathan


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Jeff King
On Wed, Nov 02, 2016 at 04:46:13PM -0700, Brandon Williams wrote:

> > > I thought at first we'd have to deal with leaking "v", but "get_value"
> > > is the "raw" version that gives you the uninterpreted value. I think
> > > that means it may give you NULL, though if we see an implicit bool like:
> > > 
> > >   [core]
> > >   allowProtocol
> > > 
> > > That's nonsense, of course, but we would still segfault. I
> > > think the easiest way to test is:
> > > 
> > >   git -c core.allowProtocol fetch
> > > 
> > > which seems to segfault for me with this patch.
> > 
> > what is the desired behavior when a user provides a config in a way that
> > isn't intended?
> 
> oh...I can just drop in git_config_get_string_const() instead.

Yes, it will call git_config_string(), which will make sure there's an
actual value and die otherwise. But note that it also duplicates the
string, so you'd have to deal with freeing it.

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Brandon Williams
> > > diff --git a/transport.c b/transport.c
> > > index d57e8de..b1098cd 100644
> > > --- a/transport.c
> > > +++ b/transport.c
> > > @@ -652,7 +652,7 @@ static const struct string_list 
> > > *protocol_whitelist(void)
> > >  
> > >   if (enabled < 0) {
> > >   const char *v = getenv("GIT_ALLOW_PROTOCOL");
> > > - if (v) {
> > > + if (v || !git_config_get_value("core.allowProtocol", &v)) {
> > >   string_list_split(&allowed, v, ':', -1);
> > >   string_list_sort(&allowed);
> > >   enabled = 1;
> > 
> > I thought at first we'd have to deal with leaking "v", but "get_value"
> > is the "raw" version that gives you the uninterpreted value. I think
> > that means it may give you NULL, though if we see an implicit bool like:
> > 
> >   [core]
> >   allowProtocol
> > 
> > That's nonsense, of course, but we would still segfault. I
> > think the easiest way to test is:
> > 
> >   git -c core.allowProtocol fetch
> > 
> > which seems to segfault for me with this patch.
> 
> what is the desired behavior when a user provides a config in a way that
> isn't intended?

oh...I can just drop in git_config_get_string_const() instead.

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Brandon Williams
On 11/02, Jeff King wrote:
> On Wed, Nov 02, 2016 at 07:05:39PM -0400, Jeff King wrote:
> 
> > > +core.allowProtocol::
> > > + Provide a colon-separated list of protocols which are allowed to be
> > > + used with fetch/push/clone. This is useful to restrict recursive
> > > + submodule initialization from an untrusted repository. Any protocol not
> > > + mentioned will be disallowed (i.e., this is a whitelist, not a
> > > + blacklist). If the variable is not set at all, all protocols are
> > > + enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> > > + used as the protocol whitelist instead of this config option.
> > 
> > The "not set at all, all protocols are enabled" bit is not quite
> > correct, is it? It is true for a top-level fetch, but not for submodule
> > recursion (and especially since you are talking about submodule
> > recursion immediately before, it is rather confusing).
> 
> Heh, just saw that you copied this straight from the discussion of
> GIT_ALLOW_PROTOCOL. What idiot wrote the original? :)
> 
> It might be worth fixing both places (or possibly just fixing the
> original and phrasing this one as "If GIT_ALLOW_PROTOCOL is not set, use
> this as the default value; see git(1) for details").
> 
> -Peff

haha K I'll fix the original as well.

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Brandon Williams
On 11/02, Jeff King wrote:
> On Wed, Nov 02, 2016 at 03:20:47PM -0700, Brandon Williams wrote:
> 
> > Add configuration option 'core.allowProtocol' to allow users to create a
> > whitelist of allowed protocols for fetch/push/clone in their gitconfig.
> > 
> > For git-submodule.sh, fallback to default whitelist only if the user
> > hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
> > in their gitconfig.
> 
> This says "what", but not "why". What's the use case?
> 
> I can see somebody wanting to pare down the whitelist further (e.g.,
> because they are carrying ssh credentials that they don't want to use on
> behalf of a malicious repo). But in general I'd expect this setting to
> be a function of the environment you're operating in, and not the
> on-disk config.
> 
> Or is the intent to broaden it for cases where you have a clone that
> uses some non-standard protocol, and you want it to Just Work on
> subsequent recursive fetches?
> 
> > +core.allowProtocol::
> > +   Provide a colon-separated list of protocols which are allowed to be
> > +   used with fetch/push/clone. This is useful to restrict recursive
> > +   submodule initialization from an untrusted repository. Any protocol not
> > +   mentioned will be disallowed (i.e., this is a whitelist, not a
> > +   blacklist). If the variable is not set at all, all protocols are
> > +   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> > +   used as the protocol whitelist instead of this config option.
> 
> The "not set at all, all protocols are enabled" bit is not quite
> correct, is it? It is true for a top-level fetch, but not for submodule
> recursion (and especially since you are talking about submodule
> recursion immediately before, it is rather confusing).

Yeah stefan mentioned this to me.  I simply copied the documentaion from
GIT_ALLOW_PROTOCOL, perhaps that should be updated as well?

> 
> > --- a/git-submodule.sh
> > +++ b/git-submodule.sh
> > @@ -27,7 +27,8 @@ cd_to_toplevel
> >  #
> >  # If the user has already specified a set of allowed protocols,
> >  # we assume they know what they're doing and use that instead.
> > -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> > +config_whitelist=$(git config core.allowProtocol)
> > +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}
> 
> The original uses "=" without a ":" so that an empty variable takes
> precedence over the stock list (i.e., allowing nothing). Would you want
> the same behavior for the config variable? I.e.:
> 
>   # this should probably allow nothing, right?
>   git config core.allowProtocol ""
> 
> I think you'd have to check the return code of "git config" to
> distinguish those cases.

Oh, I didn't think of that case.  That can be done easy enough, just
makes the code a bit more verbose.

> 
> > diff --git a/transport.c b/transport.c
> > index d57e8de..b1098cd 100644
> > --- a/transport.c
> > +++ b/transport.c
> > @@ -652,7 +652,7 @@ static const struct string_list 
> > *protocol_whitelist(void)
> >  
> > if (enabled < 0) {
> > const char *v = getenv("GIT_ALLOW_PROTOCOL");
> > -   if (v) {
> > +   if (v || !git_config_get_value("core.allowProtocol", &v)) {
> > string_list_split(&allowed, v, ':', -1);
> > string_list_sort(&allowed);
> > enabled = 1;
> 
> I thought at first we'd have to deal with leaking "v", but "get_value"
> is the "raw" version that gives you the uninterpreted value. I think
> that means it may give you NULL, though if we see an implicit bool like:
> 
>   [core]
>   allowProtocol
> 
> That's nonsense, of course, but we would still segfault. I
> think the easiest way to test is:
> 
>   git -c core.allowProtocol fetch
> 
> which seems to segfault for me with this patch.

what is the desired behavior when a user provides a config in a way that
isn't intended?

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Jeff King
On Wed, Nov 02, 2016 at 07:05:39PM -0400, Jeff King wrote:

> > +core.allowProtocol::
> > +   Provide a colon-separated list of protocols which are allowed to be
> > +   used with fetch/push/clone. This is useful to restrict recursive
> > +   submodule initialization from an untrusted repository. Any protocol not
> > +   mentioned will be disallowed (i.e., this is a whitelist, not a
> > +   blacklist). If the variable is not set at all, all protocols are
> > +   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> > +   used as the protocol whitelist instead of this config option.
> 
> The "not set at all, all protocols are enabled" bit is not quite
> correct, is it? It is true for a top-level fetch, but not for submodule
> recursion (and especially since you are talking about submodule
> recursion immediately before, it is rather confusing).

Heh, just saw that you copied this straight from the discussion of
GIT_ALLOW_PROTOCOL. What idiot wrote the original? :)

It might be worth fixing both places (or possibly just fixing the
original and phrasing this one as "If GIT_ALLOW_PROTOCOL is not set, use
this as the default value; see git(1) for details").

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Jeff King
On Wed, Nov 02, 2016 at 03:20:47PM -0700, Brandon Williams wrote:

> Add configuration option 'core.allowProtocol' to allow users to create a
> whitelist of allowed protocols for fetch/push/clone in their gitconfig.
> 
> For git-submodule.sh, fallback to default whitelist only if the user
> hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
> in their gitconfig.

This says "what", but not "why". What's the use case?

I can see somebody wanting to pare down the whitelist further (e.g.,
because they are carrying ssh credentials that they don't want to use on
behalf of a malicious repo). But in general I'd expect this setting to
be a function of the environment you're operating in, and not the
on-disk config.

Or is the intent to broaden it for cases where you have a clone that
uses some non-standard protocol, and you want it to Just Work on
subsequent recursive fetches?

> +core.allowProtocol::
> + Provide a colon-separated list of protocols which are allowed to be
> + used with fetch/push/clone. This is useful to restrict recursive
> + submodule initialization from an untrusted repository. Any protocol not
> + mentioned will be disallowed (i.e., this is a whitelist, not a
> + blacklist). If the variable is not set at all, all protocols are
> + enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
> + used as the protocol whitelist instead of this config option.

The "not set at all, all protocols are enabled" bit is not quite
correct, is it? It is true for a top-level fetch, but not for submodule
recursion (and especially since you are talking about submodule
recursion immediately before, it is rather confusing).

> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -27,7 +27,8 @@ cd_to_toplevel
>  #
>  # If the user has already specified a set of allowed protocols,
>  # we assume they know what they're doing and use that instead.
> -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> +config_whitelist=$(git config core.allowProtocol)
> +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}

The original uses "=" without a ":" so that an empty variable takes
precedence over the stock list (i.e., allowing nothing). Would you want
the same behavior for the config variable? I.e.:

  # this should probably allow nothing, right?
  git config core.allowProtocol ""

I think you'd have to check the return code of "git config" to
distinguish those cases.

> diff --git a/transport.c b/transport.c
> index d57e8de..b1098cd 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
>  
>   if (enabled < 0) {
>   const char *v = getenv("GIT_ALLOW_PROTOCOL");
> - if (v) {
> + if (v || !git_config_get_value("core.allowProtocol", &v)) {
>   string_list_split(&allowed, v, ':', -1);
>   string_list_sort(&allowed);
>   enabled = 1;

I thought at first we'd have to deal with leaking "v", but "get_value"
is the "raw" version that gives you the uninterpreted value. I think
that means it may give you NULL, though if we see an implicit bool like:

  [core]
  allowProtocol

That's nonsense, of course, but we would still segfault. I
think the easiest way to test is:

  git -c core.allowProtocol fetch

which seems to segfault for me with this patch.

-Peff


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Brandon Williams
On 11/02, Stefan Beller wrote:

> > This is useful to restrict recursive
> > +   submodule initialization from an untrusted repository.
> 
> ok. Though as a user submodules may not spring to mind immediately here.
> I think this is generally useful, too. e.g. an admin could put this in
> the system wide
> config to prevent certain protocols from being used.

Oh I pretty much copied the description from what exists for
`GIT_ALLOW_PROTOCOL` which included this bit about submodules.

> > Any protocol not
> > +   mentioned will be disallowed
> 
> For the regular fetch/clone/pull case. For the submodule case we still
> fall back to
> the hardcoded list of known good things?

Yep! This is done by explicitly setting GIT_ALLOW_PROTOCOL to the
hardcoded list if the user hasn't supplied a whitelist.

> 
> > (i.e., this is a whitelist, not a
> > +   blacklist).
> 
> That is very explicit, I'd drop it. However this inspires bike
> shedding on the name:
> What about core.protocolWhitelist instead?

Simply to keep the name similar to the env variable that already exists
for this functionality.

> So the env var is of higher priority than this config.


> > -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> > +config_whitelist=$(git config core.allowProtocol)
> 
> So first we lookup the configured protocols.
> 
> > +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}
> 
> Then if they are not configured use the current hard coded white list.

The lookup of the configured whitelist is done first but wont be used
unless GIT_ALLOW_PROTOCOL is unset.  If neither is set it will fallback
to the hardcoded list.

> Do we have any tests for this that could be extended? (Otherwise we'd
> maybe want to add a test for both the regular case as well as a forbidden
> submodule?)
> 

I can write a couple tests for a v2 of the patch.

-- 
Brandon Williams


Re: [PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Stefan Beller
On Wed, Nov 2, 2016 at 3:20 PM, Brandon Williams  wrote:
> Add configuration option 'core.allowProtocol' to allow users to create a
> whitelist of allowed protocols for fetch/push/clone in their gitconfig.
>
> For git-submodule.sh, fallback to default whitelist only if the user
> hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
> in their gitconfig.
>
> Signed-off-by: Brandon Williams 
> ---
>  Documentation/config.txt | 9 +
>  git-submodule.sh | 3 ++-
>  transport.c  | 2 +-
>  3 files changed, 12 insertions(+), 2 deletions(-)
>
> diff --git a/Documentation/config.txt b/Documentation/config.txt
> index 27069ac..7f83e40 100644
> --- a/Documentation/config.txt
> +++ b/Documentation/config.txt
> @@ -455,6 +455,15 @@ core.sshCommand::
> the `GIT_SSH_COMMAND` environment variable and is overridden
> when the environment variable is set.
>
> +core.allowProtocol::
> +   Provide a colon-separated list of protocols which are allowed to be
> +   used with fetch/push/clone.

ok.

> This is useful to restrict recursive
> +   submodule initialization from an untrusted repository.

ok. Though as a user submodules may not spring to mind immediately here.
I think this is generally useful, too. e.g. an admin could put this in
the system wide
config to prevent certain protocols from being used.

> Any protocol not
> +   mentioned will be disallowed

For the regular fetch/clone/pull case. For the submodule case we still
fall back to
the hardcoded list of known good things?

> (i.e., this is a whitelist, not a
> +   blacklist).

That is very explicit, I'd drop it. However this inspires bike
shedding on the name:
What about core.protocolWhitelist instead?

>  If the variable is not set at all, all protocols are
> +   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it 
> is
> +   used as the protocol whitelist instead of this config option.

So the env var is of higher priority than this config.

> +
>  core.ignoreStat::
> If true, Git will avoid using lstat() calls to detect if files have
> changed by setting the "assume-unchanged" bit for those tracked files
> diff --git a/git-submodule.sh b/git-submodule.sh
> index a024a13..ad94c75 100755
> --- a/git-submodule.sh
> +++ b/git-submodule.sh
> @@ -27,7 +27,8 @@ cd_to_toplevel
>  #
>  # If the user has already specified a set of allowed protocols,
>  # we assume they know what they're doing and use that instead.
> -: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
> +config_whitelist=$(git config core.allowProtocol)

So first we lookup the configured protocols.

> +: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}

Then if they are not configured use the current hard coded white list.

>  export GIT_ALLOW_PROTOCOL
>
>  command=
> diff --git a/transport.c b/transport.c
> index d57e8de..b1098cd 100644
> --- a/transport.c
> +++ b/transport.c
> @@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
>
> if (enabled < 0) {
> const char *v = getenv("GIT_ALLOW_PROTOCOL");
> -   if (v) {
> +   if (v || !git_config_get_value("core.allowProtocol", &v)) {

This implementation matches what the config promised, I would think.

Do we have any tests for this that could be extended? (Otherwise we'd
maybe want to add a test for both the regular case as well as a forbidden
submodule?)


> string_list_split(&allowed, v, ':', -1);
> string_list_sort(&allowed);
> enabled = 1;
> --
> 2.8.0.rc3.226.g39d4020
>


[PATCH] transport: add core.allowProtocol config option

2016-11-02 Thread Brandon Williams
Add configuration option 'core.allowProtocol' to allow users to create a
whitelist of allowed protocols for fetch/push/clone in their gitconfig.

For git-submodule.sh, fallback to default whitelist only if the user
hasn't explicitly set `GIT_ALLOW_PROTOCOL` or doesn't have a whitelist
in their gitconfig.

Signed-off-by: Brandon Williams 
---
 Documentation/config.txt | 9 +
 git-submodule.sh | 3 ++-
 transport.c  | 2 +-
 3 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/Documentation/config.txt b/Documentation/config.txt
index 27069ac..7f83e40 100644
--- a/Documentation/config.txt
+++ b/Documentation/config.txt
@@ -455,6 +455,15 @@ core.sshCommand::
the `GIT_SSH_COMMAND` environment variable and is overridden
when the environment variable is set.
 
+core.allowProtocol::
+   Provide a colon-separated list of protocols which are allowed to be
+   used with fetch/push/clone. This is useful to restrict recursive
+   submodule initialization from an untrusted repository. Any protocol not
+   mentioned will be disallowed (i.e., this is a whitelist, not a
+   blacklist). If the variable is not set at all, all protocols are
+   enabled. If the `GIT_ALLOW_PROTOCOL` enviornment variable is set, it is
+   used as the protocol whitelist instead of this config option.
+
 core.ignoreStat::
If true, Git will avoid using lstat() calls to detect if files have
changed by setting the "assume-unchanged" bit for those tracked files
diff --git a/git-submodule.sh b/git-submodule.sh
index a024a13..ad94c75 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -27,7 +27,8 @@ cd_to_toplevel
 #
 # If the user has already specified a set of allowed protocols,
 # we assume they know what they're doing and use that instead.
-: ${GIT_ALLOW_PROTOCOL=file:git:http:https:ssh}
+config_whitelist=$(git config core.allowProtocol)
+: ${GIT_ALLOW_PROTOCOL=${config_whitelist:-file:git:http:https:ssh}}
 export GIT_ALLOW_PROTOCOL
 
 command=
diff --git a/transport.c b/transport.c
index d57e8de..b1098cd 100644
--- a/transport.c
+++ b/transport.c
@@ -652,7 +652,7 @@ static const struct string_list *protocol_whitelist(void)
 
if (enabled < 0) {
const char *v = getenv("GIT_ALLOW_PROTOCOL");
-   if (v) {
+   if (v || !git_config_get_value("core.allowProtocol", &v)) {
string_list_split(&allowed, v, ':', -1);
string_list_sort(&allowed);
enabled = 1;
-- 
2.8.0.rc3.226.g39d4020