Re: Test failures in t4034
Ramsay Jones writes: > Yes, there was a net increase in the line count when I introduced > die(), but the main program flow was less cluttered by error handling. > The net result looked much better, so I thought it was worth it. > > What may not be too obvious, however, is that test-regex.c was written > to be independent of git. That part I was very aware of actually; it it is a bit tricky to tell what the right thing to do, though. Your test itself needs to be pretty much portable without the portability help you would get git-compat-util.h, but the point of this kind of test is to tell if you want to define preprocessor macros that may affect the behaviour of such compatibility layer ;-) > Given that I'm now building it as part of git, I should have simply > #included and used the die() routine from libgit.a > (since I'm now *relying* on test-regex being linked with libgit.a). OK. >>> +int main(int argc, char **argv) >>> +{ >>> + char *pat = "[^={} \t]+"; >>> + char *str = "={}\nfred"; >>> + regex_t r; >>> + regmatch_t m[1]; >>> + >>> + if (regcomp(&r, pat, REG_EXTENDED | REG_NEWLINE)) >>> + die("failed regcomp() for pattern '%s'", pat); >>> + if (regexec(&r, str, 1, m, 0)) >>> + die("no match of pattern '%s' to string '%s'", pat, str); >>> + >>> + /* http://sourceware.org/bugzilla/show_bug.cgi?id=3957 */ >>> + if (m[0].rm_so == 3) /* matches '\n' when it should not */ >>> + exit(1); >> >> This could be the third call site of die() that tells the user to >> build with NO_REGEX=1. Then "cd t && sh t0070-fundamental.sh -i -v" would >> give that message directly to the user. > > Hmm, even without "-i -v", it's *very* clear what is going on, but sure > it wouldn't hurt either. (Also, I wanted to be able to distinguish an exit > via die() from a "test failed" error return). OK. Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Junio C Hamano wrote: > Ramsay Jones writes: > [snip] >> diff --git a/test-regex.c b/test-regex.c >> new file mode 100644 >> index 000..9259985 >> --- /dev/null >> +++ b/test-regex.c >> @@ -0,0 +1,35 @@ >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +static void die(const char *fmt, ...) >> +{ >> +va_list p; >> + >> +va_start(p, fmt); >> +vfprintf(stderr, fmt, p); >> +va_end(p); >> +fputc('\n', stderr); >> +exit(128); >> +} > > Looks like a bit of overkill for only two call sites, whose output > we would never see because it is behind the test, but OK. Yes, there was a net increase in the line count when I introduced die(), but the main program flow was less cluttered by error handling. The net result looked much better, so I thought it was worth it. What may not be too obvious, however, is that test-regex.c was written to be independent of git. You should be able to compile the (single) file on any POSIX system to determine if the system regex routines suffer this problem. (It was also supposed to be quiet, unless it die()-ed, and provide the result via the exit code). Given that I'm now building it as part of git, I should have simply #included and used the die() routine from libgit.a (since I'm now *relying* on test-regex being linked with libgit.a). >> +int main(int argc, char **argv) >> +{ >> +char *pat = "[^={} \t]+"; >> +char *str = "={}\nfred"; >> +regex_t r; >> +regmatch_t m[1]; >> + >> +if (regcomp(&r, pat, REG_EXTENDED | REG_NEWLINE)) >> +die("failed regcomp() for pattern '%s'", pat); >> +if (regexec(&r, str, 1, m, 0)) >> +die("no match of pattern '%s' to string '%s'", pat, str); >> + >> +/* http://sourceware.org/bugzilla/show_bug.cgi?id=3957 */ >> +if (m[0].rm_so == 3) /* matches '\n' when it should not */ >> +exit(1); > > This could be the third call site of die() that tells the user to > build with NO_REGEX=1. Then "cd t && sh t0070-fundamental.sh -i -v" would > give that message directly to the user. Hmm, even without "-i -v", it's *very* clear what is going on, but sure it wouldn't hurt either. (Also, I wanted to be able to distinguish an exit via die() from a "test failed" error return). So, new (tested) version of the patch comming. ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Ramsay Jones writes: > I think that, after some testing, this (or something like it) is the > best that we can do. What do you think? > > ATB, > Ramsay Jones > > -- >8 -- > Subject: [PATCH] test-regex: Add a test to check for a bug in the regex > routines > > Signed-off-by: Ramsay Jones > --- > diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh > index 9bee8bf..da2c504 100755 > --- a/t/t0070-fundamental.sh > +++ b/t/t0070-fundamental.sh > @@ -25,4 +25,9 @@ test_expect_success POSIXPERM 'mktemp to unwritable > directory prints filename' ' > grep "cannotwrite/test" err > ' > > +test_expect_success 'check for a bug in the regex routines' ' > + # if this test fails, re-build git with NO_REGEX=1 > + test-regex > +' OK. > test_done > diff --git a/test-regex.c b/test-regex.c > new file mode 100644 > index 000..9259985 > --- /dev/null > +++ b/test-regex.c > @@ -0,0 +1,35 @@ > +#include > +#include > +#include > +#include > +#include > + > +static void die(const char *fmt, ...) > +{ > + va_list p; > + > + va_start(p, fmt); > + vfprintf(stderr, fmt, p); > + va_end(p); > + fputc('\n', stderr); > + exit(128); > +} Looks like a bit of overkill for only two call sites, whose output we would never see because it is behind the test, but OK. > +int main(int argc, char **argv) > +{ > + char *pat = "[^={} \t]+"; > + char *str = "={}\nfred"; > + regex_t r; > + regmatch_t m[1]; > + > + if (regcomp(&r, pat, REG_EXTENDED | REG_NEWLINE)) > + die("failed regcomp() for pattern '%s'", pat); > + if (regexec(&r, str, 1, m, 0)) > + die("no match of pattern '%s' to string '%s'", pat, str); > + > + /* http://sourceware.org/bugzilla/show_bug.cgi?id=3957 */ > + if (m[0].rm_so == 3) /* matches '\n' when it should not */ > + exit(1); This could be the third call site of die() that tells the user to build with NO_REGEX=1. Then "cd t && sh t0070-fundamental.sh -i -v" would give that message directly to the user. > + exit(0); > +} Thanks. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Junio C Hamano wrote: > Ramsay Jones writes: > >> I had the same problem (or at least it *looks* like the same problem) on >> Linux >> last year (May 2011), which turned out to be a bug in the regex routines in >> an >> old version of glibc. >> >> I don't know OS X at all, so this may not be relevent; does OS X use glibc? >> (I didn't think so, but ...) >> >> I sent some patches to the list which may be helpful. I can't get to gmane to >> look up a reference, but you need to search for: >> >> [RFC/PATCH] userdiff.c: Avoid old glibc regex bug causing t4034-*.sh >> test failures >> >> sent on 3rd May 2011. > > Thanks; that's $gmane/172676 for people who prefer easier to read > threading interface. > >> Also, in the same thread, a reply to Jonathan Nieder on 7th May contains a >> test which checks whether your regex routines suffer this bug. >> >> These patches were not applied since I didn't think this would be a common >> problem. I simply set NO_REGEX=1 in my config.mak, since the compat/ regex >> routines don't suffer from this problem. > > You also said: > > This is an RFC because: >- A simple fix would be for me to put NO_REGEX=1 in my config.mak, > since the compat/regex routines don't suffer this problem. >- I suspect this bug is old enough that it will not affect many users. >- I have not audited the other non-matching list expressions in > userdiff.c >- blame, grep and pickaxe all call regcomp() with the REG_NEWLINE > flag, but get the regex from the user (eg from command line). > > I think: > > - the second "this is old enough" assumption was broken again by >Brian this week ;-) > > - the first "Use NO_REGEX if your regexp library is broken" is a >reasonable thing to do; is this something we may want to throw >into the platform specific section of the top-level Makefile? > > - among the fourth, "blame" and "grep" goes line by line, and even >though pickaxe is primarily meant to take multi-line pattern, I >do not think people give multi-line pattern when they use it in >the regexp mode. So I do not think they pose a real issue even >though they get an arbitrary pattern from the user. > > - the third, combined with the fact that end user can define their >own pattern, is a killer. We cannot really afford to let broken >regex library to break us. > > I think a sensible way to go in the longer term, while we wait these > old regexp libraries die out, is to help people to avoid building > git without NO_REGEX on platforms where they need it. Agreed. Did you take a look at the second patch I mentioned above? I've included a rebased version (onto v1.7.12) of that patch below. NOTE: I have not even attempted to compile this version of the patch and I can't remember how much testing I did last year, so this is included *only* for discussion purposes ... I think that, after some testing, this (or something like it) is the best that we can do. What do you think? ATB, Ramsay Jones -- >8 -- Subject: [PATCH] test-regex: Add a test to check for a bug in the regex routines Signed-off-by: Ramsay Jones --- .gitignore | 1 + Makefile | 1 + t/t0070-fundamental.sh | 5 + test-regex.c | 35 +++ 4 files changed, 42 insertions(+) create mode 100644 test-regex.c diff --git a/.gitignore b/.gitignore index bb5c91e..68fe464 100644 --- a/.gitignore +++ b/.gitignore @@ -189,6 +189,7 @@ /test-mktemp /test-parse-options /test-path-utils +/test-regex /test-revision-walking /test-run-command /test-sha1 diff --git a/Makefile b/Makefile index 6b0c961..3b760d3 100644 --- a/Makefile +++ b/Makefile @@ -496,6 +496,7 @@ TEST_PROGRAMS_NEED_X += test-mergesort TEST_PROGRAMS_NEED_X += test-mktemp TEST_PROGRAMS_NEED_X += test-parse-options TEST_PROGRAMS_NEED_X += test-path-utils +TEST_PROGRAMS_NEED_X += test-regex TEST_PROGRAMS_NEED_X += test-revision-walking TEST_PROGRAMS_NEED_X += test-run-command TEST_PROGRAMS_NEED_X += test-scrap-cache-tree diff --git a/t/t0070-fundamental.sh b/t/t0070-fundamental.sh index 9bee8bf..da2c504 100755 --- a/t/t0070-fundamental.sh +++ b/t/t0070-fundamental.sh @@ -25,4 +25,9 @@ test_expect_success POSIXPERM 'mktemp to unwritable directory prints filename' ' grep "cannotwrite/test" err ' +test_expect_success 'check for a bug in the regex routines' ' + # if this test fails, re-build git with NO_REGEX=1 + test-regex +' + test_done diff --git a/test-regex.c b/test-regex.c new file mode 100644 index 000..9259985 --- /dev/null +++ b/test-regex.c @@ -0,0 +1,35 @@ +#include +#include +#include +#include +#include + +static void die(const char *fmt, ...) +{ + va_list p; + + va_start(p, fmt); + vfprintf(stderr, fmt, p); + va_end(p); + fputc('\n', stderr); + exit(128); +} + +int main(int argc, char **argv) +{ + char *pat = "[^={} \t]+"; + char *str = "=
Re: Test failures in t4034
Ramsay Jones writes: > I had the same problem (or at least it *looks* like the same problem) on Linux > last year (May 2011), which turned out to be a bug in the regex routines in an > old version of glibc. > > I don't know OS X at all, so this may not be relevent; does OS X use glibc? > (I didn't think so, but ...) > > I sent some patches to the list which may be helpful. I can't get to gmane to > look up a reference, but you need to search for: > > [RFC/PATCH] userdiff.c: Avoid old glibc regex bug causing t4034-*.sh test > failures > > sent on 3rd May 2011. Thanks; that's $gmane/172676 for people who prefer easier to read threading interface. > Also, in the same thread, a reply to Jonathan Nieder on 7th May contains a > test which checks whether your regex routines suffer this bug. > > These patches were not applied since I didn't think this would be a common > problem. I simply set NO_REGEX=1 in my config.mak, since the compat/ regex > routines don't suffer from this problem. You also said: This is an RFC because: - A simple fix would be for me to put NO_REGEX=1 in my config.mak, since the compat/regex routines don't suffer this problem. - I suspect this bug is old enough that it will not affect many users. - I have not audited the other non-matching list expressions in userdiff.c - blame, grep and pickaxe all call regcomp() with the REG_NEWLINE flag, but get the regex from the user (eg from command line). I think: - the second "this is old enough" assumption was broken again by Brian this week ;-) - the first "Use NO_REGEX if your regexp library is broken" is a reasonable thing to do; is this something we may want to throw into the platform specific section of the top-level Makefile? - among the fourth, "blame" and "grep" goes line by line, and even though pickaxe is primarily meant to take multi-line pattern, I do not think people give multi-line pattern when they use it in the regexp mode. So I do not think they pose a real issue even though they get an arbitrary pattern from the user. - the third, combined with the fact that end user can define their own pattern, is a killer. We cannot really afford to let broken regex library to break us. I think a sensible way to go in the longer term, while we wait these old regexp libraries die out, is to help people to avoid building git without NO_REGEX on platforms where they need it. Thanks for digging an old article. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Am 19.08.2012 16:50, schrieb Ramsay Jones: > Brian Gernhardt wrote: >> I've been getting a couple of test failures and finally had the >> time to track them down. >> >> t4034-diff-words fails tests "22 diff driver 'bibtex'" and "26 diff >> driver 'html'". Bisecting shows that the file started giving me >> errors in commit 8d96e72 "t4034: bulk verify builtin word regex >> sanity", which appears to introduce those tests. I don't see >> anything obviously wrong with the tests and I'm not familiar with >> the diff-words code, so I'm not sure what's wrong. >> >> I am running on OS X 10.8, with Xcode 4.4.1 (llvm-gcc 4.2.1). > > I had the same problem (or at least it *looks* like the same problem) > on Linux last year (May 2011), which turned out to be a bug in the > regex routines in an old version of glibc. I also had the same problem, but did not remember why I don't have it anymore. Now that you mention it: It was the same situation and I came to the same conclusion (old glibc, bogus regex implementation). I worked it around with NO_REGEX=YesPlease in config.mak. -- Hannes -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Brian Gernhardt wrote: > I've been getting a couple of test failures and finally had the time to track > them down. > > t4034-diff-words fails tests "22 diff driver 'bibtex'" and "26 diff driver > 'html'". Bisecting shows that the file started giving me errors in commit > 8d96e72 "t4034: bulk verify builtin word regex sanity", which appears to > introduce those tests. I don't see anything obviously wrong with the tests > and I'm not familiar with the diff-words code, so I'm not sure what's wrong. > > I am running on OS X 10.8, with Xcode 4.4.1 (llvm-gcc 4.2.1). I had the same problem (or at least it *looks* like the same problem) on Linux last year (May 2011), which turned out to be a bug in the regex routines in an old version of glibc. I don't know OS X at all, so this may not be relevent; does OS X use glibc? (I didn't think so, but ...) I sent some patches to the list which may be helpful. I can't get to gmane to look up a reference, but you need to search for: [RFC/PATCH] userdiff.c: Avoid old glibc regex bug causing t4034-*.sh test failures sent on 3rd May 2011. Also, in the same thread, a reply to Jonathan Nieder on 7th May contains a test which checks whether your regex routines suffer this bug. These patches were not applied since I didn't think this would be a common problem. I simply set NO_REGEX=1 in my config.mak, since the compat/ regex routines don't suffer from this problem. HTH ATB, Ramsay Jones -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Brian Gernhardt writes: > I wonder if there is something non-portable about the bibtex > filter? (The HTML filter as well, since that errors on my machine > too.) Yeah, that I didn't think of, but is a possibility (part of (1) above). The HTML one is "[^<>= \t]+" and the Bibtex one is "[={}\"]|[^={}\" \t]+" and both will be used with "|[^[:space:]]|[\xc0-\xff][\x80-\xbf]+" appended and given to regcomp with REG_EXTENDED|REG_NEWLINE. Nothing jumps at me that is common to these two but not shared by other patterns. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: Test failures in t4034
Brian Gernhardt writes: > I've been getting a couple of test failures and finally had the time to track > them down. > > t4034-diff-words fails tests "22 diff driver 'bibtex'" and "26 > diff driver 'html'". Bisecting shows that the file started giving > me errors in commit 8d96e72 "t4034: bulk verify builtin word regex > sanity", which appears to introduce those tests. I don't see > anything obviously wrong with the tests and I'm not familiar with > the diff-words code, so I'm not sure what's wrong. > > I am running on OS X 10.8, with Xcode 4.4.1 (llvm-gcc 4.2.1). > > Test results follow: > > -- 8< -- > > expecting success: > cp "$TEST_DIRECTORY/t4034/bibtex/pre" \ > "$TEST_DIRECTORY/t4034/bibtex/post" \ > "$TEST_DIRECTORY/t4034/bibtex/expect" . && > echo "* diff=bibtex" >.gitattributes && > word_diff --color-words > > --- expect2012-08-18 05:54:29.0 + > +++ output.decrypted 2012-08-18 05:54:29.0 + > @@ -8,8 +8,8 @@ >author={Aldous, D.David}, >journal={Information Theory, IEEE Transactions on}, >volume={33Bogus.}, > - number={24}, > + number={4}, >pages={219--223}, > - year=1987, > - note={This is in fact a rather funny read since ethernet works well > in practice. The {1987\em pre} reference is the > right one, however.}, > + year={1987},1987, > + note={This is in fact a rather funny read since ethernet works well in > practice. The {\em pre} reference is the right one, however.} > } > not ok - 22 diff driver 'bibtex' Thanks for a report. Off the top of my head, there may be three possibilities. (1) The compiled binary of Git is broken on your platform and not formatting the escape sequence correctly. I somehow think it is very unlikely, as the code to do so is pretty much platform agonistic (color.c does not use anything fancy from system libraries). (2) The test script, the part that converts the escape sequence to human readable form, is broken---not written in a portable awk. (3) The implementation of awk on your platform was broken by your supplier, with the same infinite wisdom they broke the UTF-8 pathnames on their filesystem implementation with ;-) Can you help isolating the issue first to see if it is (1) or one of the other two? Run "cd t && sh t4034-diff-words -i" to force stop the test upon the first breakage, and inspect the "output" before the awk script test_decode_color munges it. Does it show a red number 2 and green number 4 on the line that begins with "number=" (or if you have an access to a box on which this test passes, grab the raw output from it by running this test, and make byte-for-byte comparison)? -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Test failures in t4034
I've been getting a couple of test failures and finally had the time to track them down. t4034-diff-words fails tests "22 diff driver 'bibtex'" and "26 diff driver 'html'". Bisecting shows that the file started giving me errors in commit 8d96e72 "t4034: bulk verify builtin word regex sanity", which appears to introduce those tests. I don't see anything obviously wrong with the tests and I'm not familiar with the diff-words code, so I'm not sure what's wrong. I am running on OS X 10.8, with Xcode 4.4.1 (llvm-gcc 4.2.1). Test results follow: -- 8< -- expecting success: cp "$TEST_DIRECTORY/t4034/bibtex/pre" \ "$TEST_DIRECTORY/t4034/bibtex/post" \ "$TEST_DIRECTORY/t4034/bibtex/expect" . && echo "* diff=bibtex" >.gitattributes && word_diff --color-words --- expect 2012-08-18 05:54:29.0 + +++ output.decrypted2012-08-18 05:54:29.0 + @@ -8,8 +8,8 @@ author={Aldous, D.David}, journal={Information Theory, IEEE Transactions on}, volume={33Bogus.}, - number={24}, + number={4}, pages={219--223}, - year=1987, - note={This is in fact a rather funny read since ethernet works well in practice. The {1987\em pre} reference is the right one, however.}, + year={1987},1987, + note={This is in fact a rather funny read since ethernet works well in practice. The {\em pre} reference is the right one, however.} } not ok - 22 diff driver 'bibtex' -- 8< -- expecting success: cp "$TEST_DIRECTORY/t4034/html/pre" \ "$TEST_DIRECTORY/t4034/html/post" \ "$TEST_DIRECTORY/t4034/html/expect" . && echo "* diff=html" >.gitattributes && word_diff --color-words --- expect 2012-08-18 05:54:29.0 + +++ output.decrypted2012-08-18 05:54:29.0 + @@ -4,5 +4,5 @@ +++ b/post @@ -1,3 +1,3 @@ newattr="newvalue">added content -"value""newvalue">contentchanged -content &entity;&newentity; +changed +content &newentity; not ok - 26 diff driver 'html'-- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html