Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-23 Thread Jeff King
On Thu, Feb 23, 2017 at 10:08:57PM -0800, Junio C Hamano wrote:

> Anyway, here is an updated one (the part of the patch to t/ is not
> shown as it is unchanged).
> 
> -- >8 --
> Subject: [PATCH] config: use git_config_parse_key() in 
> git_config_parse_parameter()

Looks good. Nice and simple.

-Peff


Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-23 Thread Junio C Hamano
Jeff King  writes:

>> Backtracking will not fundamentally "fix" parsing of
>> 
>>  a.b=c=.d
>> 
>> between twhse two
>> 
>>  [a "b="] c = ".d"
>>  [a]  b = "c=.d"
>> 
>> unfortunately, I think.  I do not think it is worth doing the "best
>> effort" with erroring out when ambiguous, because there is no way
>> for the end user to disambiguate, unless we introduce a different
>> syntax, at which point we cannot use config_parse_key() anymore.
>
> Ah, yeah, you're right. I thought the problem was just that the "split"
> was too naive, but it really is that the whole syntax is badly
> specified.
>
> I guess "git config --list" suffers from the same problem. You can get
> around it there with "-z", but that probably would not be very pleasant
> here. :)
>
> Probably not worth worrying too much about if nobody is complaining.

Yup.

Anyway, here is an updated one (the part of the patch to t/ is not
shown as it is unchanged).

-- >8 --
Subject: [PATCH] config: use git_config_parse_key() in 
git_config_parse_parameter()

The parsing of one-shot assignments of configuration variables that
come from the command line historically was quite loose and allowed
anything to pass.  It also downcased everything in the variable name,
even a three-level .. name in which
the  part must be treated in a case sensitive manner.

Existing git_config_parse_key() helper is used to parse the variable
name that comes from the command line, i.e. "git config VAR VAL",
and handles these details correctly.  Replace the strbuf_tolower()
call in git_config_parse_parameter() with a call to it to correct
both issues.  git_config_parse_key() does a bit more things that are
not necessary for the purpose of this codepath (e.g. it allocates a
separate buffer to return the canonicalized variable name because it
takes a "const char *" input), but we are not in a performance-critical
codepath here.

Signed-off-by: Junio C Hamano 
---
 config.c | 14 +-
 1 file changed, 9 insertions(+), 5 deletions(-)

diff --git a/config.c b/config.c
index b8cce1dffa..1c1a1520ff 100644
--- a/config.c
+++ b/config.c
@@ -295,7 +295,9 @@ int git_config_parse_parameter(const char *text,
   config_fn_t fn, void *data)
 {
const char *value;
+   char *canonical_name;
struct strbuf **pair;
+   int ret;
 
pair = strbuf_split_str(text, '=', 2);
if (!pair[0])
@@ -313,13 +315,15 @@ int git_config_parse_parameter(const char *text,
strbuf_list_free(pair);
return error("bogus config parameter: %s", text);
}
-   strbuf_tolower(pair[0]);
-   if (fn(pair[0]->buf, value, data) < 0) {
-   strbuf_list_free(pair);
-   return -1;
+
+   if (git_config_parse_key(pair[0]->buf, _name, NULL)) {
+   ret = -1;
+   } else {
+   ret = (fn(canonical_name, value, data) < 0) ? -1 : 0;
+   free(canonical_name);
}
strbuf_list_free(pair);
-   return 0;
+   return ret;
 }
 
 int git_config_from_parameters(config_fn_t fn, void *data)
-- 
2.12.0-rc2-308-gbf7e63c428



Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-23 Thread Jeff King
On Thu, Feb 23, 2017 at 08:17:44PM -0800, Junio C Hamano wrote:

> > Hmm. I suspect one cannot do:
> >
> >   git -c 'section.subsection with an = in it.key=foo' ...
> >
> > Definitely not a new problem, nor something that should block your
> > patch. But if we want to fix it, I suspect the problem will ultimately
> > involve parsing left-to-right to get the key first, then confirming it
> > has an =, and then the value.
> 
> Backtracking will not fundamentally "fix" parsing of
> 
>   a.b=c=.d
> 
> between twhse two
> 
>   [a "b="] c = ".d"
>   [a]  b = "c=.d"
> 
> unfortunately, I think.  I do not think it is worth doing the "best
> effort" with erroring out when ambiguous, because there is no way
> for the end user to disambiguate, unless we introduce a different
> syntax, at which point we cannot use config_parse_key() anymore.

Ah, yeah, you're right. I thought the problem was just that the "split"
was too naive, but it really is that the whole syntax is badly
specified.

I guess "git config --list" suffers from the same problem. You can get
around it there with "-z", but that probably would not be very pleasant
here. :)

Probably not worth worrying too much about if nobody is complaining.

-Peff


Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-23 Thread Junio C Hamano
Jeff King  writes:

>>  pair = strbuf_split_str(text, '=', 2);
>>  if (!pair[0])
>
> Hmm. I suspect one cannot do:
>
>   git -c 'section.subsection with an = in it.key=foo' ...
>
> Definitely not a new problem, nor something that should block your
> patch. But if we want to fix it, I suspect the problem will ultimately
> involve parsing left-to-right to get the key first, then confirming it
> has an =, and then the value.

Backtracking will not fundamentally "fix" parsing of

a.b=c=.d

between twhse two

[a "b="] c = ".d"
[a]  b = "c=.d"

unfortunately, I think.  I do not think it is worth doing the "best
effort" with erroring out when ambiguous, because there is no way
for the end user to disambiguate, unless we introduce a different
syntax, at which point we cannot use config_parse_key() anymore.

>> +if (git_config_parse_key(pair[0]->buf, _name, NULL))
>>  return -1;
>> -}
>
> I think git_config_parse_key() will free canonical_name itself if it
> returns failure. But do you need to strbuf_list_free(pair) here?

Yeah, I missed that one.  Thanks.


Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-23 Thread Jeff King
On Thu, Feb 23, 2017 at 03:19:58PM -0800, Junio C Hamano wrote:

> > But you are right.  config-parse-key does have the simpler string
> > that can just be given to the canonicalize thing and we should be
> > able to reuse it.
> 
> Actually, I think we can just use the existing config_parse_key()
> instead of adding the new function.  It adds one allocation and
> deallocation, but it's not like this is a performance-critical
> codepath that we absolutely avoid extra allocations.  After all, we
> are still using the strbuf-split thing :-/.

Yeah, you're right. This is much nicer, and everything else was
premature optimization.

> -- >8 --
> From: Junio C Hamano 
> Date: Thu, 23 Feb 2017 15:04:40 -0800
> Subject: [PATCH] config: reject invalid VAR in 'git -c VAR=VAL command' and
>  keep subsection intact

Long subject. :)

I'd have just said:

  config: pass variables through git_config_parse_parameter()

That is "what", but the "why" can come in the next paragraph.

> The parsing of one-shot assignments of configuration variables that
> come from the command line historically was quite loose and allowed
> anything to pass.  It also downcased everything in the variable name,
> even a three-level .. name in which
> the  part must be treated in a case sensible manner.
> 
> Existing git_config_parse_key() helper is used to parse the variable
> name that comes from the command line, i.e. "git config VAR VAL",
> and handles these details correctly.  Replace the strbuf_tolower()
> call in git-config_parse_parameter() with a call to it to correct
> both issues.  git_config_parse_key() does a bit more things that are
> not necessary for the purpose of this codepath (e.g. it allocates a
> separate buffer to return the canonicalized variable name because it
> takes a "const char *" input), but we are not in a performance-critical
> codepath here.

Nicely explained.

> diff --git a/config.c b/config.c
> index b8cce1dffa..39f20dcd2c 100644
> --- a/config.c
> +++ b/config.c
> @@ -295,7 +295,9 @@ int git_config_parse_parameter(const char *text,
>  config_fn_t fn, void *data)
>  {
>   const char *value;
> + char *canonical_name;
>   struct strbuf **pair;
> + int ret = 0;
>  
>   pair = strbuf_split_str(text, '=', 2);
>   if (!pair[0])

Hmm. I suspect one cannot do:

  git -c 'section.subsection with an = in it.key=foo' ...

Definitely not a new problem, nor something that should block your
patch. But if we want to fix it, I suspect the problem will ultimately
involve parsing left-to-right to get the key first, then confirming it
has an =, and then the value.

> @@ -313,13 +315,15 @@ int git_config_parse_parameter(const char *text,
>   strbuf_list_free(pair);
>   return error("bogus config parameter: %s", text);
>   }
> - strbuf_tolower(pair[0]);
> - if (fn(pair[0]->buf, value, data) < 0) {
> - strbuf_list_free(pair);
> +
> + if (git_config_parse_key(pair[0]->buf, _name, NULL))
>   return -1;
> - }

I think git_config_parse_key() will free canonical_name itself if it
returns failure. But do you need to strbuf_list_free(pair) here?

Or alternatively:

  int ret = -1;
  if (!parse(...))
  ret = fn(...);

or use a "got out". Whatever. You don't need me to teach you about error
exits. :)

> + ret = (fn(canonical_name, value, data) < 0) ? -1 : 0;
> +
> + free(canonical_name);
>   strbuf_list_free(pair);
> - return 0;
> + return ret;

Looks good.

> diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
> index 923bfc5a26..ea371020fa 100755
> --- a/t/t1300-repo-config.sh
> +++ b/t/t1300-repo-config.sh

I just skimmed these, as they look like the previous ones.

So overall I like it, modulo the minor error-leak.

-Peff


Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-23 Thread Junio C Hamano
Junio C Hamano  writes:

> Jeff King  writes:
>
>> FWIW, the code looks OK here. It is a shame to duplicate the policy
>> found in git_config_parse_key(), though.
>>
>> I wonder if we could make a master version of that which canonicalizes
>> in-place, and then just wrap it for the git_config_parse_key()
>> interface. Actually, I guess the function you just wrote _is_ that inner
>> function, as long as it learned about the "quiet" flag.
>
> Hmm, I obviously missed an opportunity
> ...
> But you are right.  config-parse-key does have the simpler string
> that can just be given to the canonicalize thing and we should be
> able to reuse it.

Actually, I think we can just use the existing config_parse_key()
instead of adding the new function.  It adds one allocation and
deallocation, but it's not like this is a performance-critical
codepath that we absolutely avoid extra allocations.  After all, we
are still using the strbuf-split thing :-/.

The attached patch shows the updated fix.  It needs a preparatory
code move (not shown here) to make git_config_parse_key() available
to git_config_parse_parameter(), though.

-- >8 --
From: Junio C Hamano 
Date: Thu, 23 Feb 2017 15:04:40 -0800
Subject: [PATCH] config: reject invalid VAR in 'git -c VAR=VAL command' and
 keep subsection intact

The parsing of one-shot assignments of configuration variables that
come from the command line historically was quite loose and allowed
anything to pass.  It also downcased everything in the variable name,
even a three-level .. name in which
the  part must be treated in a case sensible manner.

Existing git_config_parse_key() helper is used to parse the variable
name that comes from the command line, i.e. "git config VAR VAL",
and handles these details correctly.  Replace the strbuf_tolower()
call in git-config_parse_parameter() with a call to it to correct
both issues.  git_config_parse_key() does a bit more things that are
not necessary for the purpose of this codepath (e.g. it allocates a
separate buffer to return the canonicalized variable name because it
takes a "const char *" input), but we are not in a performance-critical
codepath here.

Signed-off-by: Junio C Hamano 
---
 config.c   | 14 
 t/t1300-repo-config.sh | 62 ++
 2 files changed, 71 insertions(+), 5 deletions(-)

diff --git a/config.c b/config.c
index b8cce1dffa..39f20dcd2c 100644
--- a/config.c
+++ b/config.c
@@ -295,7 +295,9 @@ int git_config_parse_parameter(const char *text,
   config_fn_t fn, void *data)
 {
const char *value;
+   char *canonical_name;
struct strbuf **pair;
+   int ret = 0;
 
pair = strbuf_split_str(text, '=', 2);
if (!pair[0])
@@ -313,13 +315,15 @@ int git_config_parse_parameter(const char *text,
strbuf_list_free(pair);
return error("bogus config parameter: %s", text);
}
-   strbuf_tolower(pair[0]);
-   if (fn(pair[0]->buf, value, data) < 0) {
-   strbuf_list_free(pair);
+
+   if (git_config_parse_key(pair[0]->buf, _name, NULL))
return -1;
-   }
+
+   ret = (fn(canonical_name, value, data) < 0) ? -1 : 0;
+
+   free(canonical_name);
strbuf_list_free(pair);
-   return 0;
+   return ret;
 }
 
 int git_config_from_parameters(config_fn_t fn, void *data)
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 923bfc5a26..ea371020fa 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1097,6 +1097,68 @@ test_expect_success 'multiple git -c appends config' '
test_cmp expect actual
 '
 
+test_expect_success 'last one wins: two level vars' '
+
+   # sec.var and sec.VAR are the same variable, as the first
+   # and the last level of a configuration variable name is
+   # case insensitive.
+
+   echo VAL >expect &&
+
+   git -c sec.var=val -c sec.VAR=VAL config --get sec.var >actual &&
+   test_cmp expect actual &&
+   git -c SEC.var=val -c sec.var=VAL config --get sec.var >actual &&
+   test_cmp expect actual &&
+
+   git -c sec.var=val -c sec.VAR=VAL config --get SEC.var >actual &&
+   test_cmp expect actual &&
+   git -c SEC.var=val -c sec.var=VAL config --get sec.VAR >actual &&
+   test_cmp expect actual
+'
+
+test_expect_success 'last one wins: three level vars' '
+
+   # v.a.r and v.A.r are not the same variable, as the middle
+   # level of a three-level configuration variable name is
+   # case sensitive.
+
+   echo val >expect &&
+   git -c v.a.r=val -c v.A.r=VAL config --get v.a.r >actual &&
+   test_cmp expect actual &&
+   git -c v.a.r=val -c v.A.r=VAL config --get V.a.R >actual &&
+   test_cmp expect actual &&
+
+   # v.a.r and V.a.R are the same variable, as the first
+   # and the last level of a 

Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-22 Thread Junio C Hamano
Jeff King  writes:

> FWIW, the code looks OK here. It is a shame to duplicate the policy
> found in git_config_parse_key(), though.
>
> I wonder if we could make a master version of that which canonicalizes
> in-place, and then just wrap it for the git_config_parse_key()
> interface. Actually, I guess the function you just wrote _is_ that inner
> function, as long as it learned about the "quiet" flag.

Hmm, I obviously missed an opportunity.  I thought about doing a
similar thing with the policy in parse-source, but that side didn't
seem worth doing, as the config-parse-source callgraph is quite a
mess (as it has to parse the .ini like format with line breaks and
comments, not the simple "[.]." thing, it
cannot quite avoid it), and it needs to take advantage of _some_
policy to parse the pieces.

We could loosen the policy implemented by config-parse-key interface
(e.g. change config-parse-source to let anything that begins with a
non-whitespace continue to be processed with get_value(), instead of
only allowing string that begins with isalpha(); similarly loosen
get_value() to allow any non-dot non-space string, not just
iskeychar() bytes) and first turn what is read into the simple
"[.]." format, and then reuse the new
"master" logic to validate.  That would allow us to update the
"master" logic to make it tighter or looser to some degree, but the
source parser still needs to hardcode _some_ policy (e.g. the first
level and the last level names begin with a non-space) that allows
it to guess what "" pieces the contents being parsed from
the .ini looking format meant to express.

But you are right.  config-parse-key does have the simpler string
that can just be given to the canonicalize thing and we should be
able to reuse it.





Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-22 Thread Jeff King
On Tue, Feb 21, 2017 at 01:24:38PM -0800, Junio C Hamano wrote:

> The parsing of one-shot assignments of configuration variables that
> come from the command line historically was quite loose and allowed
> anything to pass.
> 
> The configuration variable names that come from files are validated
> in git_config_parse_source(), which uses get_base_var() that grabs
> the  (and subsection) while making sure that 
> consists of iskeychar() letters, the function itself that makes sure
> that the first letter in  is isalpha(), and get_value()
> that grabs the remainder of the  name while making sure
> that it consists of iskeychar() letters.
> 
> Perform an equivalent check in canonicalize_config_variable_name()
> to catch invalid configuration variable names that come from the
> command line.

FWIW, the code looks OK here. It is a shame to duplicate the policy
found in git_config_parse_key(), though.

I wonder if we could make a master version of that which canonicalizes
in-place, and then just wrap it for the git_config_parse_key()
interface. Actually, I guess the function you just wrote _is_ that inner
function, as long as it learned about the "quiet" flag.

-Peff


Re: [PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-21 Thread Junio C Hamano
Junio C Hamano  writes:

>   /* find the last dot (we start from the first dot we just found) */
> - for (last_dot = cp; *cp; cp++)
> + for (; *cp; cp++)
>   if (*cp == '.')
>   last_dot = cp;

This line probably needs this fix-up on top.

-- >8 --
Subject: [PATCH] config: squelch stupid compiler warnings

Some compilers do not realize that *cp is always '.' when the loop
to find the last dot begins, and instead gives a useless warning
that says last_dot may be uninitialized.

Squelch it by being a bit more explicit if stupid.

Signed-off-by: Junio C Hamano 
---
 config.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config.c b/config.c
index e7f7ff1938..90de27853f 100644
--- a/config.c
+++ b/config.c
@@ -239,7 +239,7 @@ static int canonicalize_config_variable_name(char *varname)
return -1; /* no section? */
 
/* find the last dot (we start from the first dot we just found) */
-   for (; *cp; cp++)
+   for (last_dot = cp; *cp; cp++)
if (*cp == '.')
last_dot = cp;
 
-- 
2.12.0-rc2-231-g83a1c8597c



[PATCH v2] config: reject invalid VAR in 'git -c VAR=VAL command'

2017-02-21 Thread Junio C Hamano
The parsing of one-shot assignments of configuration variables that
come from the command line historically was quite loose and allowed
anything to pass.

The configuration variable names that come from files are validated
in git_config_parse_source(), which uses get_base_var() that grabs
the  (and subsection) while making sure that 
consists of iskeychar() letters, the function itself that makes sure
that the first letter in  is isalpha(), and get_value()
that grabs the remainder of the  name while making sure
that it consists of iskeychar() letters.

Perform an equivalent check in canonicalize_config_variable_name()
to catch invalid configuration variable names that come from the
command line.

Signed-off-by: Junio C Hamano 
---

 * Now with an updated test; while writing it it uncovered a bug in
   the original test that expected to fail---they failed alright but
   sometimes failed for a wrong reason.

 config.c   | 48 +---
 t/t1300-repo-config.sh | 16 
 2 files changed, 53 insertions(+), 11 deletions(-)

diff --git a/config.c b/config.c
index 4128debc71..e7f7ff1938 100644
--- a/config.c
+++ b/config.c
@@ -199,32 +199,62 @@ void git_config_push_parameter(const char *text)
strbuf_release();
 }
 
+static inline int iskeychar(int c)
+{
+   return isalnum(c) || c == '-';
+}
+
 /*
  * downcase the  and  in . or
  * .. and do so in place.  
  * is left intact.
+ *
+ * The configuration variable names that come from files are validated
+ * in git_config_parse_source(), which uses get_base_var() that grabs
+ * the  (and subsection) while making sure that 
+ * consists of iskeychar() letters, the function itself that makes
+ * sure that the first letter in  is isalpha(), and
+ * get_value() that grabs the remainder of the  name while
+ * making sure that it consists of iskeychar() letters.  Perform a
+ * matching validation for configuration variables that come from
+ * the command line.
  */
-static void canonicalize_config_variable_name(char *varname)
+static int canonicalize_config_variable_name(char *varname)
 {
-   char *cp, *last_dot;
+   char *cp, *first_dot, *last_dot;
 
/* downcase the first segment */
for (cp = varname; *cp; cp++) {
if (*cp == '.')
break;
+   if (!iskeychar(*cp))
+   return -1;
*cp = tolower(*cp);
}
if (!*cp)
-   return;
+   return -1; /* no dot anywhere? */
+
+   first_dot = cp;
+   if (first_dot == varname)
+   return -1; /* no section? */
 
/* find the last dot (we start from the first dot we just found) */
-   for (last_dot = cp; *cp; cp++)
+   for (; *cp; cp++)
if (*cp == '.')
last_dot = cp;
 
+   if (!last_dot[1])
+   return -1; /* no variable? */
+
/* downcase the last segment */
-   for (cp = last_dot; *cp; cp++)
+   for (cp = last_dot + 1; *cp; cp++) {
+   if (cp == last_dot + 1 && !isalpha(*cp))
+   return -1;
+   else if (!iskeychar(*cp))
+   return -1;
*cp = tolower(*cp);
+   }
+   return 0;
 }
 
 int git_config_parse_parameter(const char *text,
@@ -249,7 +279,8 @@ int git_config_parse_parameter(const char *text,
strbuf_list_free(pair);
return error("bogus config parameter: %s", text);
}
-   canonicalize_config_variable_name(pair[0]->buf);
+   if (canonicalize_config_variable_name(pair[0]->buf))
+   return error("bogus config parameter: %s", text);
if (fn(pair[0]->buf, value, data) < 0) {
strbuf_list_free(pair);
return -1;
@@ -382,11 +413,6 @@ static char *parse_value(void)
}
 }
 
-static inline int iskeychar(int c)
-{
-   return isalnum(c) || c == '-';
-}
-
 static int get_value(config_fn_t fn, void *data, struct strbuf *name)
 {
int c;
diff --git a/t/t1300-repo-config.sh b/t/t1300-repo-config.sh
index 7a16f66a9d..ea371020fa 100755
--- a/t/t1300-repo-config.sh
+++ b/t/t1300-repo-config.sh
@@ -1143,6 +1143,22 @@ test_expect_success 'last one wins: three level vars' '
test_cmp expect actual
 '
 
+for VAR in a .a a. a.0b a."b c". a."b c".0d
+do
+   test_expect_success "git -c $VAR=VAL rejects invalid '$VAR'" '
+   test_must_fail git -c "$VAR=VAL" config -l
+   '
+done
+
+for VAR in a.b a."b c".d
+do
+   test_expect_success "git -c $VAR=VAL works with valid '$VAR'" '
+   echo VAL >expect &&
+   git -c "$VAR=VAL" config --get "$VAR" >actual &&
+   test_cmp expect actual
+   '
+done
+
 test_expect_success 'git -c is not confused by empty environment' '
GIT_CONFIG_PARAMETERS="" git -c x.one=1 config --list
 '
-- 
2.12.0-rc2-222-gff02733afe