Re: T460: new sporadic failures with emacs 29
Michael J Gruber writes:

> Yes, that's why I wrote "ignore". Something like NOTMUCH_IGNORE_TESTS
> which runs the test, outputs the diff on fail, but "succeeds" without
> counting towards pass/fail, and reports the number of ignored
> pass/fail separately - basically "known_broken" without the
> "known/expectation".
>
> I just don't know whether it's worth it. Other folks disable a whole
> test suite when they want to get a package update going ...

I guess it would be mainly interesting for distro packagers, or for
people who want to run some kind of CI. Without looking at the code, I
think just ignoring the return value would be relatively easy, while
keeping track of ignored tests might be a bit more work. Maybe Tomi has
a clearer idea / finds this a fun problem.

d

___
notmuch mailing list -- [email protected]
To unsubscribe send an email to [email protected]
Re: T460: new sporadic failures with emacs 29
On Thu, 31 Aug 2023 at 17:17, David Bremner wrote:
>
> Michael J Gruber writes:
>
> > I still get those issues. OTOH, skipping T460.14 did not show any
> > adverse side effects. So I'll do that for emacs29.
> > It might be nice to mark some tests ignored rather than skipped so that
> > we notice when they do not fail sporadically any more. That is, *if*
> > we look at the output of a passing test suite ...
> >
>
> It is possible to selectively mark tests as broken, but it requires
> patching the test suite, and it sets a failing exit code if those tests
> start passing.

Yes, that's why I wrote "ignore". Something like NOTMUCH_IGNORE_TESTS
which runs the test, outputs the diff on fail, but "succeeds" without
counting towards pass/fail, and reports the number of ignored
pass/fail separately - basically "known_broken" without the
"known/expectation".

I just don't know whether it's worth it. Other folks disable a whole
test suite when they want to get a package update going ...

Michael
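A minimal sketch of the "ignore" idea described above, to make the bookkeeping concrete. It is not notmuch's test-lib.sh: NOTMUCH_IGNORE_TESTS is only the name proposed in this thread, and the helper names (is_ignored, check_output, summary) and the four counters are hypothetical, invented for illustration.

```shell
# Hypothetical sketch, NOT notmuch's actual test library.
# NOTMUCH_IGNORE_TESTS: space-separated test ids to ignore, e.g. "T460.14".
NOTMUCH_IGNORE_TESTS="${NOTMUCH_IGNORE_TESTS:-}"
passed=0
failed=0
ignored_pass=0
ignored_fail=0

# is_ignored ID: succeed if ID is listed in NOTMUCH_IGNORE_TESTS.
is_ignored() {
    for _id in $NOTMUCH_IGNORE_TESTS; do
        [ "$_id" = "$1" ] && return 0
    done
    return 1
}

# check_output ID EXPECTED ACTUAL: compare two strings, show both on a
# mismatch, and count the result either in the normal tallies or in the
# separate "ignored" tallies.
check_output() {
    _ok=1
    if [ "$2" != "$3" ]; then
        _ok=0
        printf '%s: output differs\n--- expected\n%s\n+++ actual\n%s\n' "$1" "$2" "$3"
    fi
    if is_ignored "$1"; then
        if [ "$_ok" -eq 1 ]; then
            ignored_pass=$((ignored_pass + 1))
        else
            ignored_fail=$((ignored_fail + 1))
        fi
    elif [ "$_ok" -eq 1 ]; then
        passed=$((passed + 1))
    else
        failed=$((failed + 1))
    fi
}

# summary: print all tallies; the return status reflects only the
# non-ignored failures, so an ignored test can never fail the run.
summary() {
    echo "passed=$passed failed=$failed ignored_pass=$ignored_pass ignored_fail=$ignored_fail"
    [ "$failed" -eq 0 ]
}
```

With NOTMUCH_IGNORE_TESTS="T460.14", a mismatch in T460.14 would still print the differing output but only bump ignored_fail, and a later run where it passes shows up in ignored_pass, which is the "notice when they do not fail any more" part.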
Re: T460: new sporadic failures with emacs 29
Michael J Gruber writes:

> I still get those issues. OTOH, skipping T460.14 did not show any
> adverse side effects. So I'll do that for emacs29.
> It might be nice to mark some tests ignored rather than skipped so that
> we notice when they do not fail sporadically any more. That is, *if*
> we look at the output of a passing test suite ...
>

It is possible to selectively mark tests as broken, but it requires
patching the test suite, and it sets a failing exit code if those tests
start passing.
Re: T460: new sporadic failures with emacs 29
On Thu, 31 Aug 2023 at 15:16, David Bremner wrote:
>
> Michael J Gruber writes:
>
> > On Sat, 26 Aug 2023 at 16:41, David Bremner wrote:
> >>
> >> Michael J Gruber writes:
> >>
> >> > I tried the current 0.38rc1 on COPR, and unfortunately I get the same
> >> > T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
> >> > out of 35 buildroots).
> >> > Did you get your fails with emacs 29 only, or with earlier emacs?
> >>
> >> I only tested emacs 29; it would be some different incantation to
> >> semi-disable native compilation for emacs 28.x. Are you seeing those
> >> same failures (where emacs is attempting to write into /usr/bin) on
> >> older emacs?
> >
> > No, I see them only on Fedora rawhide/ELN and Fedora 39, but not on
> > the current release 38 or earlier. Emacs 29/28 is one difference and
> > an obvious guess as the cause, but it could be dtach or whatnot.
>
> Hmm. I just built 200 times in sbuild (chroot) so I guess I am no longer
> able to reproduce the issue on Debian, fwiw.

I still get those issues. OTOH, skipping T460.14 did not show any
adverse side effects. So I'll do that for emacs29.
It might be nice to mark some tests ignored rather than skipped so that
we notice when they do not fail sporadically any more. That is, *if*
we look at the output of a passing test suite ...

Michael
Re: T460: new sporadic failures with emacs 29
Michael J Gruber writes:

> On Sat, 26 Aug 2023 at 16:41, David Bremner wrote:
>>
>> Michael J Gruber writes:
>>
>> > I tried the current 0.38rc1 on COPR, and unfortunately I get the same
>> > T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
>> > out of 35 buildroots).
>> > Did you get your fails with emacs 29 only, or with earlier emacs?
>>
>> I only tested emacs 29; it would be some different incantation to
>> semi-disable native compilation for emacs 28.x. Are you seeing those
>> same failures (where emacs is attempting to write into /usr/bin) on
>> older emacs?
>
> No, I see them only on Fedora rawhide/ELN and Fedora 39, but not on
> the current release 38 or earlier. Emacs 29/28 is one difference and
> an obvious guess as the cause, but it could be dtach or whatnot.

Hmm. I just built 200 times in sbuild (chroot) so I guess I am no longer
able to reproduce the issue on Debian, fwiw.
Re: T460: new sporadic failures with emacs 29
On Sat, 26 Aug 2023 at 16:41, David Bremner wrote:
>
> Michael J Gruber writes:
>
> > I tried the current 0.38rc1 on COPR, and unfortunately I get the same
> > T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
> > out of 35 buildroots).
> > Did you get your fails with emacs 29 only, or with earlier emacs?
>
> I only tested emacs 29; it would be some different incantation to
> semi-disable native compilation for emacs 28.x. Are you seeing those
> same failures (where emacs is attempting to write into /usr/bin) on
> older emacs?

No, I see them only on Fedora rawhide/ELN and Fedora 39, but not on
the current release 38 or earlier. Emacs 29/28 is one difference and
an obvious guess as the cause, but it could be dtach or whatnot.

I get the same failures with notmuch 0.37+your patch on koji now
(rawhide, f39; not f38), sporadically. I'm confident it's only in the
test suite, so I can disable that test on Fedora for the release
build. (Will have to test whether the failures creep up somewhere else
then.)

Michael
Re: T460: new sporadic failures with emacs 29
Michael J Gruber writes:

> I tried the current 0.38rc1 on COPR, and unfortunately I get the same
> T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
> out of 35 buildroots).
> Did you get your fails with emacs 29 only, or with earlier emacs?

I only tested emacs 29; it would be some different incantation to
semi-disable native compilation for emacs 28.x. Are you seeing those
same failures (where emacs is attempting to write into /usr/bin) on
older emacs?

> There's also one patch I want to send out before release, hopefully in
> a minute or two ;-)

OK
Re: T460: new sporadic failures with emacs 29
On Sat, 26 Aug 2023 at 00:28, David Bremner wrote:
>
> Michael J Gruber writes:
>
> > It took more runs to get some fails now, and archs vary, so I still
> > think it's a timeout. And no way to get it locally so far.
>
> I can duplicate it locally about once every 40 runs of the complete test
> suite.
>
> > ENOLISP (for me) but could it be the case that notmuch-test-wait can
> > abort its while loop too early if the first buffer write takes longer
> > than the timeout, or if some other process writes (because the process
> > parameter is nil)? Is something different for emacs 29 in this regard?
> > Any clues from sbuild?
>
> Can you try the attached patch? It needs more testing, but I did get 140
> runs of the test suite without an error.

I tried the current 0.38rc1 on COPR, and unfortunately I get the same
T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
out of 35 buildroots).
Did you get your fails with emacs 29 only, or with earlier emacs?
Trying with 0.37+patches on KOJI right now.

There's also one patch I want to send out before release, hopefully in
a minute or two ;-)

Michael
Re: T460: new sporadic failures with emacs 29
Michael J Gruber writes:

> It took more runs to get some fails now, and archs vary, so I still
> think it's a timeout. And no way to get it locally so far.

I can duplicate it locally about once every 40 runs of the complete test
suite.

> ENOLISP (for me) but could it be the case that notmuch-test-wait can
> abort its while loop too early if the first buffer write takes longer
> than the timeout, or if some other process writes (because the process
> parameter is nil)? Is something different for emacs 29 in this regard?
> Any clues from sbuild?

Can you try the attached patch? It needs more testing, but I did get 140
runs of the test suite without an error.

diff --git a/test/test-lib.el b/test/test-lib.el
index 236dd99e..709c3b36 100644
--- a/test/test-lib.el
+++ b/test/test-lib.el
@@ -22,6 +22,10 @@
 
 ;;; Code:
 
+(setq native-comp-jit-compilation nil)
+(setq native-comp-speed -1)
+(setq native-comp-async-jobs-number 1)
+
 (require 'cl-lib)
 
 ;; Ensure that the dynamic variables that are defined by this library
Re: T460: new sporadic failures with emacs 29
On Thu, 24 Aug 2023 at 17:10, David Bremner wrote:
>
> David Bremner writes:
>
> > I just saw this when running in debian's "sbuild" isolated build
> > environment. So my current guess is that this has to do with HOME
> > pointing somewhere nonexistent. Is that also the case in COPR?
> >
> > d
>
> I realized that we override HOME inside the tests anyway, so emacs
> should think there is some writable HOME in any case. I did notice that
> the tests trigger a bunch of emacs native compilation (because the
> caching happens in the temporary $HOME, which gets blown away every
> time).

Also, $HOME is set in all my build envs (pass or fail), and
permissions are the same. Bummer.

It took more runs to get some fails now, and archs vary, so I still
think it's a timeout. And no way to get it locally so far.

ENOLISP (for me) but could it be the case that notmuch-test-wait can
abort its while loop too early if the first buffer write takes longer
than the timeout, or if some other process writes (because the process
parameter is nil)? Is something different for emacs 29 in this regard?
Any clues from sbuild?

Michael
Re: T460: new sporadic failures with emacs 29
David Bremner writes:

> I just saw this when running in debian's "sbuild" isolated build
> environment. So my current guess is that this has to do with HOME
> pointing somewhere nonexistent. Is that also the case in COPR?
>
> d

I realized that we override HOME inside the tests anyway, so emacs
should think there is some writable HOME in any case. I did notice that
the tests trigger a bunch of emacs native compilation (because the
caching happens in the temporary $HOME, which gets blown away every
time).
Re: T460: new sporadic failures with emacs 29
On Thu, 24 Aug 2023 at 16:01, David Bremner wrote:
>
> David Bremner writes:
>
> > Michael J Gruber writes:
> >
> >> [email protected]
> >> -http://notmuchmail.org/mailman/listinfo/notmuch
> >> *ERROR*: Opening output file: Permission denied, /usr/bin/OUTPUT
> >> PASS Stash id
> >> ```
> >>
> >> That "/usr/bin/OUTPUT" looks strange and smells like a mis-expanded
> >> variable.
> >
> > Yes, that's pretty weird. The only writes to "OUTPUT" are relative to
> > emacs default-directory. Not sure how that could be set to /usr/bin;
> > possibly some weird script involved with starting emacs?
>
> I just saw this when running in debian's "sbuild" isolated build
> environment. So my current guess is that this has to do with HOME
> pointing somewhere nonexistent. Is that also the case in COPR?
>

I encountered this on koji (the main Fedora infra), too, and am trying
with an increased wait (1 rather than 0.1) right now. Dunno by how
much this increases test suite run times.

HOME could be an issue only if some builder VMs are set up
differently, I guess? They shouldn't be (which does not necessarily
mean they aren't).

Michael
Re: T460: new sporadic failures with emacs 29
David Bremner writes:

> Michael J Gruber writes:
>
>> [email protected]
>> -http://notmuchmail.org/mailman/listinfo/notmuch
>> *ERROR*: Opening output file: Permission denied, /usr/bin/OUTPUT
>> PASS Stash id
>> ```
>>
>> That "/usr/bin/OUTPUT" looks strange and smells like a mis-expanded
>> variable.
>
> Yes, that's pretty weird. The only writes to "OUTPUT" are relative to
> emacs default-directory. Not sure how that could be set to /usr/bin;
> possibly some weird script involved with starting emacs?

I just saw this when running in debian's "sbuild" isolated build
environment. So my current guess is that this has to do with HOME
pointing somewhere nonexistent. Is that also the case in COPR?

d
Re: T460: new sporadic failures with emacs 29
Michael J Gruber writes:

> [email protected]
> -http://notmuchmail.org/mailman/listinfo/notmuch
> *ERROR*: Opening output file: Permission denied, /usr/bin/OUTPUT
> PASS Stash id
> ```
>
> That "/usr/bin/OUTPUT" looks strange and smells like a mis-expanded
> variable.

Yes, that's pretty weird. The only writes to "OUTPUT" are relative to
emacs default-directory. Not sure how that could be set to /usr/bin;
possibly some weird script involved with starting emacs?

> Why sporadically, though? The emacs tests wait 0.1 before writing - I
> dunno why, but those waits are fragile and make me nervous about even
> keeping the tests for release builds.

One thing that might help is to make the wait a global variable, so
that various CI/build scenarios could set it to some generous length.

>
> I guess due to its load, COPR is prone to exposing timing issues.
>

There are some very slow architectures (e.g. mipsel) on the Debian
buildds, so I'm a bit surprised we don't see similar issues
there. Maybe you are just doing more builds (which is great, obviously).
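As a rough illustration of the "global wait" suggestion, here is a sketch of how a build environment could override the wait while everyone else keeps the 0.1 default mentioned above. NOTMUCH_TEST_WAIT and test_wait_seconds are hypothetical names, not existing notmuch settings, and this only covers the shell side; how the value would reach the Emacs Lisp code that actually waits is left open.

```shell
# Hypothetical sketch: NOTMUCH_TEST_WAIT is an invented variable name.
# test_wait_seconds: print the wait (in seconds) to use. Falls back to
# 0.1 when the variable is unset, empty, or not a plain decimal number.
test_wait_seconds() {
    case "${NOTMUCH_TEST_WAIT:-}" in
        '' | . | *[!0-9.]* | *.*.*)
            echo 0.1 ;;                    # unset or malformed: default
        *)
            echo "$NOTMUCH_TEST_WAIT" ;;   # builder-supplied value
    esac
}
```

A slow or heavily loaded builder could then export, say, NOTMUCH_TEST_WAIT=1 before running the suite, which matches the "1 rather than 0.1" experiment without patching the tests.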
