Re: git 2.0.0 PROFILE=BUILD check-phase problems with ./t5561-http-backend.sh; GIT_TEST_HTTPD=false problems with t5537-fetch-shallow.sh
On 30 May 2014, Jeff King told this: On Thu, May 29, 2014 at 11:44:37PM +0100, Nix wrote: I observe test failures with git 2.0.0 which are attributable to the change to run network tests by default. I'm lumping them both together into one report because I'm lazy and I've blown too much time on this already. Weird. I also see a strange failure on t5310 when building with PROFILE=BUILD. We get a segfault when reading jgit-produced bitmaps. I don't see that, but this may be a compiler difference, or environmental difference, or something. Tracking it down, we're getting inexplicably bogus data from an mmap'd file (!). Compiling without PROFILE=BUILD, the test passes fine (even with valgrind). If I instrument it like this: diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c index f7f700e..8cafacf 100644 --- a/ewah/ewah_io.c +++ b/ewah/ewah_io.c @@ -119,6 +119,7 @@ int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len) ptr += sizeof(uint32_t); self-buffer_size = self-alloc_size = get_be32(ptr); + warning(got buffer_size of %lu from %lu, self-buffer_size, *(uint32_t *)ptr); ptr += sizeof(uint32_t); self-buffer = ewah_realloc(self-buffer, a regular test-run reads: warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 3 from 50331648 warning: got buffer_size of 1 from 16777216 warning: got buffer_size of 2 from 33554432 and a PROFILE=BUILD one reads: warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 3 from 50331648 warning: got buffer_size of 1 from 16777216 warning: got buffer_size of 131072 from 512 I'm willing to believe we're doing something weird that violates the C standard there, but I really can't see it. It's doubly strange because you're not building with -flto at all, all callers of ewah_read_mmap() are in a separate file, and there are no conditionals or loops above the warning() locus. Maybe the error is in a different file, and is causing a nonsense 'len' to get passed in? But I think this *may* be a false alarm. *(uint32_t *)ptr on a uint8_t (in your logging line) is an aliasing violation, and the nearly- uninitialized '512' is pretty much what I'd expect GCC to do now and then, given that. Just a guess. (I'm surprised this doesn't cause a problem *always*, though.) It appears that the Apache daemon is writing to the log slowly enough that its log lines only get there after the testsuite has written its separator, so a bunch of log lines appear to be attached to the wrong test, and the comparison fails. Yeah, that looks to me like what is happening, too. If I put a 'sleep 1' into log_div, it passes. I would think apache would write the log before serving the file, but perhaps not. It does a straight write() via APR, which should work, I'd have thought. This is on a tmpfs, not a network filesystem or anything. (Aside: the flushing semantics are quite tangled, but shouldn't matter, given that we're just doing a write(). Apache calls apr_file_flush() on the logfile, but that's just flushing apr's analogue of stdio buffers: it does nothing if the file is unbuffered -- and the error log file is unbuffered. It does not call apr_file_sync(), but you shouldn't need a sync() for one process to see another process's output.) And like you, I would expect gcov to make things slower, not faster. Could there be something in the environment that I'm more inclined to blame the kernel or the compiler, but I *always* blame the kernel or the compiler. :P I'm not sure what the best fix is. We could check the logfiles after each test instead of at the end, but that will just end up with the same race: we may check them before apache has written them. Yeah, that seems less than useful. -- NULL (void) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
git 2.0.0 PROFILE=BUILD check-phase problems with ./t5561-http-backend.sh; GIT_TEST_HTTPD=false problems with t5537-fetch-shallow.sh
I observe test failures with git 2.0.0 which are attributable to the change to run network tests by default. I'm lumping them both together into one report because I'm lazy and I've blown too much time on this already. I've got Apache 2.2.24 on this box, and t5551-http-fetch-smart.sh fails in a peculiar fashion during the PROFILE=BUILD stage: --- exp 2014-05-29 22:42:50.221599297 + +++ act 2014-05-29 22:42:50.231598452 + @@ -12,10 +12,10 @@ GET /smart/repo.git/objects/info/http-alternates HTTP/1.1 200 - GET /smart/repo.git/objects/41/57d6f47fc8b7cb455fbf6f18d6f47fed49a6a5 HTTP/1.1 200 GET /smart/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.pack HTTP/1.1 200 -GET /smart/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.idx HTTP/1.1 200 ### no git-daemon-export-ok ### +GET /smart/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.idx HTTP/1.1 200 GET /smart_noexport/repo.git/HEAD HTTP/1.1 404 - GET /smart_noexport/repo.git/info/refs HTTP/1.1 404 - GET /smart_noexport/repo.git/objects/info/packs HTTP/1.1 404 - @@ -34,10 +34,10 @@ GET /smart_noexport/repo.git/objects/info/http-alternates HTTP/1.1 200 - GET /smart_noexport/repo.git/objects/41/57d6f47fc8b7cb455fbf6f18d6f47fed49a6a5 HTTP/1.1 200 GET /smart_noexport/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.pack HTTP/1.1 200 -GET /smart_noexport/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.idx HTTP/1.1 200 ### getanyfile true ### +GET /smart_noexport/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.idx HTTP/1.1 200 GET /smart/repo.git/HEAD HTTP/1.1 200 GET /smart/repo.git/info/refs HTTP/1.1 200 GET /smart/repo.git/objects/info/packs HTTP/1.1 200 @@ -45,10 +45,10 @@ GET /smart/repo.git/objects/info/http-alternates HTTP/1.1 200 - GET /smart/repo.git/objects/41/57d6f47fc8b7cb455fbf6f18d6f47fed49a6a5 HTTP/1.1 200 GET /smart/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.pack HTTP/1.1 200 -GET /smart/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.idx HTTP/1.1 200 ### getanyfile false ### +GET /smart/repo.git/objects/pack/pack-2cb3186872e8768852fcb6b22cdf2182e12f7490.idx HTTP/1.1 200 GET /smart/repo.git/HEAD HTTP/1.1 403 - GET /smart/repo.git/info/refs HTTP/1.1 403 - GET /smart/repo.git/objects/info/packs HTTP/1.1 403 - not ok 14 - server request log matches test results # # sed -e # s/^.* \// # s/\// # s/ [1-9][0-9]*\$// # s/^GET /GET / #act $HTTPD_ROOT_PATH/access.log # test_cmp exp act # # failed 1 among 14 test(s) It appears that the Apache daemon is writing to the log slowly enough that its log lines only get there after the testsuite has written its separator, so a bunch of log lines appear to be attached to the wrong test, and the comparison fails. Curiously, I can't make this happen in a conventional 'make check', even though the only relevant components would seem to be bash and httpd: if anything, you'd expect the gcovvery to slow down git and thus make it *more* likely that any race between httpd log syncing and testsuite framework output to the same logfile would be hit... Attempting to work around this by building with GIT_TEST_HTTPD=false doesn't work either: *** t5537-fetch-shallow.sh *** ok 1 - setup ok 2 - setup shallow clone ok 3 - clone from shallow clone ok 4 - fetch from shallow clone ok 5 - fetch --depth from shallow clone ok 6 - fetch --unshallow from shallow clone ok 7 - fetch something upstream has but hidden by clients shallow boundaries ok 8 - fetch that requires changes in .git/shallow is filtered ok 9 - fetch --update-shallow error: Can't use skip_all after running some tests Makefile:43: recipe for target 't5537-fetch-shallow.sh' failed make[3]: *** [t5537-fetch-shallow.sh] Error 1 since this is trying to run the httpd halfway through the test, which will never work if it's skipping it. Moving the httpd sourcing to the top of the test isn't going to work either, because that would skip *everything*, when we want to skip only the httpd bits. Maybe splitting the httpd bits into a separate test is best here? I'm not sure. -- NULL (void) -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: git 2.0.0 PROFILE=BUILD check-phase problems with ./t5561-http-backend.sh; GIT_TEST_HTTPD=false problems with t5537-fetch-shallow.sh
On Thu, May 29, 2014 at 11:44:37PM +0100, Nix wrote: I observe test failures with git 2.0.0 which are attributable to the change to run network tests by default. I'm lumping them both together into one report because I'm lazy and I've blown too much time on this already. Weird. I also see a strange failure on t5310 when building with PROFILE=BUILD. We get a segfault when reading jgit-produced bitmaps. Tracking it down, we're getting inexplicably bogus data from an mmap'd file (!). Compiling without PROFILE=BUILD, the test passes fine (even with valgrind). If I instrument it like this: diff --git a/ewah/ewah_io.c b/ewah/ewah_io.c index f7f700e..8cafacf 100644 --- a/ewah/ewah_io.c +++ b/ewah/ewah_io.c @@ -119,6 +119,7 @@ int ewah_read_mmap(struct ewah_bitmap *self, void *map, size_t len) ptr += sizeof(uint32_t); self-buffer_size = self-alloc_size = get_be32(ptr); + warning(got buffer_size of %lu from %lu, self-buffer_size, *(uint32_t *)ptr); ptr += sizeof(uint32_t); self-buffer = ewah_realloc(self-buffer, a regular test-run reads: warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 3 from 50331648 warning: got buffer_size of 1 from 16777216 warning: got buffer_size of 2 from 33554432 and a PROFILE=BUILD one reads: warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 2 from 33554432 warning: got buffer_size of 3 from 50331648 warning: got buffer_size of 1 from 16777216 warning: got buffer_size of 131072 from 512 I'm willing to believe we're doing something weird that violates the C standard there, but I really can't see it. Anyway, that's a side note to your problem... It appears that the Apache daemon is writing to the log slowly enough that its log lines only get there after the testsuite has written its separator, so a bunch of log lines appear to be attached to the wrong test, and the comparison fails. Yeah, that looks to me like what is happening, too. If I put a 'sleep 1' into log_div, it passes. I would think apache would write the log before serving the file, but perhaps not. And like you, I would expect gcov to make things slower, not faster. Could there be something in the environment that I'm not sure what the best fix is. We could check the logfiles after each test instead of at the end, but that will just end up with the same race: we may check them before apache has written them. Attempting to work around this by building with GIT_TEST_HTTPD=false doesn't work either: *** t5537-fetch-shallow.sh *** ok 1 - setup ok 2 - setup shallow clone ok 3 - clone from shallow clone ok 4 - fetch from shallow clone ok 5 - fetch --depth from shallow clone ok 6 - fetch --unshallow from shallow clone ok 7 - fetch something upstream has but hidden by clients shallow boundaries ok 8 - fetch that requires changes in .git/shallow is filtered ok 9 - fetch --update-shallow error: Can't use skip_all after running some tests Makefile:43: recipe for target 't5537-fetch-shallow.sh' failed make[3]: *** [t5537-fetch-shallow.sh] Error 1 Hrm. This already came up, and we dealt with it in 0232852 (t5537: move http tests out to t5539, 2014-02-13). Which is in v2.0.0. But somehow the code is back in v2.0.0 Presumably this is the result of a mis-merge. I'll send a patch. -Peff -- To unsubscribe from this list: send the line unsubscribe git in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html