[PATCH/RFC] attr: allow pattern escape using backslash
.gitattributes pattern syntax is supposed to be the same as .gitignore (except a few things that do not make sense in attr context, but that's a different issue). .gitignore uses fnmatch() as the matching machinery and \ is accepted as an escape code. In theory the pattern 'foo\ bar' should match path 'foo bar' in .gitignore. Granted, no one would write 'foo\ bar' in .gitignore when 'foo bar' should suffice. Regardless, 'foo\ bar attr' does not (but should) attach attr to path foo bar because pattern/attr parse code does not understand backslash escape. It parses the line as path 'foo\' and attributes 'bar' and 'attr'. This patch teaches attr code to recognize the backslash in patterns (not macro names) and pass 'foo\ bar' down to fnmatch(). This changes the attr behavior. foo\ attr, if exists in the field, would match nothing because path foo\ is invalid in UNIX and is a directory in Windows, which we do not accept attaching attributes to. With this patch, that line becomes invalid and is rejected. So it's not really bad (i.e. no silent changes in behavior). Other subtle behavioral changes may happen. Still, I think we should do this as it seems like a correct thing to do. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- It was posted [1] during rc cycles (I think) and initial feedback from Junio was not utterly wrong. I still think this is worth doing, hence this resend for discussion. [1] http://thread.gmane.org/gmane.comp.version-control.git/207135 attr.c| 12 +++- t/t0003-attributes.sh | 5 + 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/attr.c b/attr.c index 887a9ae..173d51d 100644 --- a/attr.c +++ b/attr.c @@ -221,8 +221,18 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, return NULL; } } - else + else { is_macro = 0; + namelen = 0; + while (name[namelen] != '\0' + !strchr(blank, name[namelen])) { + if (name[namelen] == '\\' + name[namelen + 1] != '\0') + namelen += 2; + else + namelen++; + } + } states = name + namelen; states += strspn(states, blank); diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index febc45c..16b419e 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -24,6 +24,7 @@ test_expect_success 'setup' ' echo offon -test test echo no notest echo A/e/F test=A/e/F + echo A\\ b test=space ) .gitattributes ( echo g test=a/g @@ -196,6 +197,10 @@ test_expect_success 'root subdir attribute test' ' attr_check subdir/a/i unspecified ' +test_expect_success 'quoting in pattern' ' + attr_check A b space +' + test_expect_success 'setup bare' ' git clone --bare . bare.git cd bare.git -- 1.7.12.1.406.g6ab07c4 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[PATCH/RFC] attr: allow pattern escape using backslash
.gitattributes pattern syntax is supposed to be the same as .gitignore (except a few things that do not make sense in attr context, but that's a different issue). .gitignore uses fnmatch() as the matching machinery and \ is accepted as an escape code. In theory the pattern 'foo\ bar' should match path 'foo bar' in .gitignore. Granted, no one would write 'foo\ bar' in .gitignore when 'foo bar' should suffice. Regardless, 'foo\ bar attr' does not (but should) attach attr to path foo bar because pattern/attr parse code does not understand backslash escape. It parses the line as path 'foo\' and attributes 'bar' and 'attr'. This patch teaches attr code to recognize the backslash in patterns (not macro names) and pass 'foo\ bar' down to fnmatch(). This changes the attr behavior. foo\ attr, if exists in the field, would match nothing because path foo\ is invalid in UNIX and is a directory in Windows, which we do not accept attaching attributes to. With this patch, that line becomes invalid and is rejected. So it's not really bad (i.e. no silent changes in behavior). Other subtle behavioral changes may happen. Still, I think we should do this as it seems like a correct thing to do. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- We discussed the spaces in path names in attr before and I remember using quotes or double quotes, even substituting spaces with dots, were raised. I don't remember if backslashes were mentioned. attr.c| 12 +++- t/t0003-attributes.sh | 5 + 2 files changed, 16 insertions(+), 1 deletion(-) diff --git a/attr.c b/attr.c index 887a9ae..173d51d 100644 --- a/attr.c +++ b/attr.c @@ -221,8 +221,18 @@ static struct match_attr *parse_attr_line(const char *line, const char *src, return NULL; } } - else + else { is_macro = 0; + namelen = 0; + while (name[namelen] != '\0' + !strchr(blank, name[namelen])) { + if (name[namelen] == '\\' + name[namelen + 1] != '\0') + namelen += 2; + else + namelen++; + } + } states = name + namelen; states += strspn(states, blank); diff --git a/t/t0003-attributes.sh b/t/t0003-attributes.sh index febc45c..16b419e 100755 --- a/t/t0003-attributes.sh +++ b/t/t0003-attributes.sh @@ -24,6 +24,7 @@ test_expect_success 'setup' ' echo offon -test test echo no notest echo A/e/F test=A/e/F + echo A\\ b test=space ) .gitattributes ( echo g test=a/g @@ -196,6 +197,10 @@ test_expect_success 'root subdir attribute test' ' attr_check subdir/a/i unspecified ' +test_expect_success 'quoting in pattern' ' + attr_check A b space +' + test_expect_success 'setup bare' ' git clone --bare . bare.git cd bare.git -- 1.7.12.1.406.g6ab07c4 -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] attr: allow pattern escape using backslash
Nguyễn Thái Ngọc Duy pclo...@gmail.com writes: .gitattributes pattern syntax is supposed to be the same as .gitignore (except a few things that do not make sense in attr context, but that's a different issue). .gitignore uses fnmatch() as the matching machinery and \ is accepted as an escape code. In theory the pattern 'foo\ bar' should match path 'foo bar' in .gitignore. Granted, no one would write 'foo\ bar' in .gitignore when 'foo bar' should suffice. Regardless, 'foo\ bar attr' does not (but should) attach attr to path foo bar because pattern/attr parse code does not understand backslash escape. It parses the line as path 'foo\' and attributes 'bar' and 'attr'. This patch teaches attr code to recognize the backslash in patterns (not macro names) and pass 'foo\ bar' down to fnmatch(). This changes the attr behavior. foo\ attr, if exists in the field, would match nothing because path foo\ is invalid in UNIX and is a directory in Windows, which we do not accept attaching attributes to. With this patch, that line becomes invalid and is rejected. So it's not really bad (i.e. no silent changes in behavior). Other subtle behavioral changes may happen. Still, I think we should do this as it seems like a correct thing to do. Signed-off-by: Nguyễn Thái Ngọc Duy pclo...@gmail.com --- We discussed the spaces in path names in attr before and I remember using quotes or double quotes, even substituting spaces with dots, were raised. I don't remember if backslashes were mentioned. My knee-jerk reaction, without thinking things through, is that this makes sense and doing the same for .gitignore would, too (even though the one-item-per-line nature of .gitignore makes \ and practically equivalent). Shouldn't we do the same for quoting fnmatch(3) metacharacters? To match a path component 'a' followed by an asterisk followed by 'b', you could then write 'a\*b'. Same for quoting the backslash itself. -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [PATCH/RFC] attr: allow pattern escape using backslash
On Sat, Oct 6, 2012 at 2:54 PM, Junio C Hamano gits...@pobox.com wrote: Shouldn't we do the same for quoting fnmatch(3) metacharacters? To match a path component 'a' followed by an asterisk followed by 'b', you could then write 'a\*b'. Same for quoting the backslash itself. I think my patch does that too. The thing it does not do is optimize this case so that we can do strcmp() instead of fnmatch() if the entire pattern does not contain any metacharacters but backslashes. I don't think it'll become popular enough to deserve an optimization. -- Duy -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html