Re: T460: new sporadic failures with emacs 29

2023-09-01 Thread David Bremner
Michael J Gruber  writes:

>
> Yes, that's why I wrote "ignore". Something like NOTMUCH_IGNORE_TESTS
> which runs the test, outputs the diff on fail, but "succeeds" without
> counting towards pass/fail, and reports the number of ignored
> pass/fail separately - basically "known_broken" without the
> "known/expectation".
>
> I just don't know whether it's worth it. Other folks disable a whole
> test suite when they want to get a package update going ...

I guess it would be mainly interesting for distro packagers, or for
people who want to run some kind of CI.

Without looking at the code, I think just ignoring the return value
would be relatively easy, while keeping track of ignored tests might be
a bit more work. Maybe Tomi has a clearer idea / finds this a fun
problem.

d
___
notmuch mailing list -- [email protected]
To unsubscribe send an email to [email protected]


Re: T460: new sporadic failures with emacs 29

2023-09-01 Thread Michael J Gruber
On Thu, 31 Aug 2023 at 17:17, David Bremner wrote:
>
> Michael J Gruber  writes:
>
> >
> > I still get those issues. OTOH, skipping T460.14 did not show any
> > adverse side effects. So I'll do that for emacs29.
> > It might be nice to mark some tests ignored rather than skipped so that
> > we notice when they do not fail sporadically any more. That is, *if*
> > we look at the output of a passing test suite ...
> >
>
> It is possible to selectively mark tests as broken, but it requires
> patching the test suite, and it sets a failing exit code if those tests
> start passing.

Yes, that's why I wrote "ignore". Something like NOTMUCH_IGNORE_TESTS
which runs the test, outputs the diff on fail, but "succeeds" without
counting towards pass/fail, and reports the number of ignored
pass/fail separately - basically "known_broken" without the
"known/expectation".

I just don't know whether it's worth it. Other folks disable a whole
test suite when they want to get a package update going ...
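
The "ignore" mode described above could be sketched roughly as follows. This is only an illustration: NOTMUCH_IGNORE_TESTS is the variable name floated in this thread, not an existing setting, and run_subtest, is_ignored, suite_status and the counters are made-up names rather than anything from notmuch's test library. Ignored subtests are still run and reported, but tallied separately, so they never affect the exit status.

```sh
# Hypothetical sketch; none of these names exist in notmuch's test suite.
passed=0
failed=0
ignored_pass=0
ignored_fail=0

# is_ignored NAME: succeed if NAME is in the space-separated ignore list.
is_ignored() {
    case " ${NOTMUCH_IGNORE_TESTS:-} " in
        *" $1 "*) return 0 ;;
        *) return 1 ;;
    esac
}

# run_subtest NAME CMD...: run CMD (its own output, e.g. a diff, passes
# through) and record the result in the matching counter.
run_subtest() {
    name=$1
    shift
    if "$@"; then
        rc=0
    else
        rc=1
    fi
    if is_ignored "$name"; then
        if [ "$rc" -eq 0 ]; then
            ignored_pass=$((ignored_pass + 1))
            echo " IGNORED (pass)  $name"
        else
            ignored_fail=$((ignored_fail + 1))
            echo " IGNORED (fail)  $name"
        fi
    elif [ "$rc" -eq 0 ]; then
        passed=$((passed + 1))
        echo " PASS   $name"
    else
        failed=$((failed + 1))
        echo " FAIL   $name"
    fi
}

# suite_status: print the tallies; fail only if a non-ignored subtest failed.
suite_status() {
    echo "passed=$passed failed=$failed ignored_pass=$ignored_pass ignored_fail=$ignored_fail"
    [ "$failed" -eq 0 ]
}
```

With NOTMUCH_IGNORE_TESTS="T460.14", a failing T460.14 would show up as "IGNORED (fail)" in the log while suite_status still succeeds, and a run where it passes would be visible as "IGNORED (pass)".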

Michael


Re: T460: new sporadic failures with emacs 29

2023-08-31 Thread David Bremner
Michael J Gruber  writes:

>
> I still get those issues. OTOH, skipping T460.14 did not show any
> adverse side effects. So I'll do that for emacs29.
> It might be nice to mark some tests ignored rather than skipped so that
> we notice when they do not fail sporadically any more. That is, *if*
> we look at the output of a passing test suite ...
>

It is possible to selectively mark tests as broken, but it requires
patching the test suite, and it sets a failing exit code if those tests
start passing. 
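
For comparison, the "known broken" behaviour described here can be sketched as below. This is a simplified stand-in, not the test suite's actual code: run_known_broken, known_broken_status and the two counters are hypothetical names. An expected failure is tolerated, while an unexpected pass makes the overall status fail.

```sh
# Hypothetical sketch of "known broken" semantics; names are illustrative.
broken=0
fixed=0

# run_known_broken NAME CMD...: CMD is expected to fail.
run_known_broken() {
    name=$1
    shift
    if "$@"; then
        # The test passed although it was marked as broken.
        fixed=$((fixed + 1))
        echo " FIXED  $name"
    else
        broken=$((broken + 1))
        echo " BROKEN $name"
    fi
}

# known_broken_status: fail if any known-broken test started passing.
known_broken_status() {
    [ "$fixed" -eq 0 ]
}
```

That failing status on an unexpected pass is what makes this mechanism awkward for a sporadic failure: a test that passes most of the time would break the build whenever it happens to pass.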


Re: T460: new sporadic failures with emacs 29

2023-08-31 Thread Michael J Gruber
On Thu, 31 Aug 2023 at 15:16, David Bremner wrote:
>
> Michael J Gruber  writes:
>
> > On Sat, 26 Aug 2023 at 16:41, David Bremner wrote:
> >>
> >> Michael J Gruber  writes:
> >>
> >> >
> >> > I tried the current 0.38rc1 on COPR, and unfortunately I get the same
> >> > T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
> >> > out of 35 buildroots).
> >> > Did you get your fails with emacs 29 only, or with earlier emacs?
> >>
> >> I only tested emacs 29; it would be some different incantation to
> >> semi-disable native compilation for emacs 28.x. Are you seeing those
> >> same failures (where emacs is attempting to write into /usr/bin) on older
> >> emacs?
> >
> > No, I see them only on Fedora rawhide/ELN and Fedora 39, but not on
> > the current release 38 or earlier. Emacs 29/28 is one difference and
> > an obvious guess as the cause, but it could be dtach or whatnot.
>
> Hmm. I just built 200 times in sbuild (chroot) so I guess I am no longer
> able to reproduce the issue on Debian, fwiw.

I still get those issues. OTOH, skipping T460.14 did not show any
adverse side effects. So I'll do that for emacs29.
It might be nice to mark some tests ignored rather than skipped so that
we notice when they do not fail sporadically any more. That is, *if*
we look at the output of a passing test suite ...

Michael


Re: T460: new sporadic failures with emacs 29

2023-08-31 Thread David Bremner
Michael J Gruber  writes:

> On Sat, 26 Aug 2023 at 16:41, David Bremner wrote:
>>
>> Michael J Gruber  writes:
>>
>> >
>> > I tried the current 0.38rc1 on COPR, and unfortunately I get the same
>> > T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
>> > out of 35 buildroots).
>> > Did you get your fails with emacs 29 only, or with earlier emacs?
>>
>> I only tested emacs 29; it would be some different incantation to
>> semi-disable native compilation for emacs 28.x. Are you seeing those
>> same failures (where emacs is attempting to write into /usr/bin) on older
>> emacs?
>
> No, I see them only on Fedora rawhide/ELN and Fedora 39, but not on
> the current release 38 or earlier. Emacs 29/28 is one difference and
> an obvious guess as the cause, but it could be dtach or whatnot.

Hmm. I just built 200 times in sbuild (chroot) so I guess I am no longer
able to reproduce the issue on Debian, fwiw. 


Re: T460: new sporadic failures with emacs 29

2023-08-26 Thread Michael J Gruber
On Sat, 26 Aug 2023 at 16:41, David Bremner wrote:
>
> Michael J Gruber  writes:
>
> >
> > I tried the current 0.38rc1 on COPR, and unfortunately I get the same
> > T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
> > out of 35 buildroots).
> > Did you get your fails with emacs 29 only, or with earlier emacs?
>
> I only tested emacs 29; it would be some different incantation to
> semi-disable native compilation for emacs 28.x. Are you seeing those
> same failures (where emacs is attempting to write into /usr/bin) on older
> emacs?

No, I see them only on Fedora rawhide/ELN and Fedora 39, but not on
the current release 38 or earlier. Emacs 29/28 is one difference and
an obvious guess as the cause, but it could be dtach or whatnot.

I get the same failures with notmuch 0.37+your patch on koji now
(rawhide, f39; not f38), sporadically.

I'm confident it's only in the test suite, so I can disable that test
on Fedora for the release build. (Will have to test whether the
failures creep up somewhere else then.)

Michael


Re: T460: new sporadic failures with emacs 29

2023-08-26 Thread David Bremner
Michael J Gruber  writes:

>
> I tried the current 0.38rc1 on COPR, and unfortunately I get the same
> T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
> out of 35 buildroots).
> Did you get your fails with emacs 29 only, or with earlier emacs?

I only tested emacs 29; it would be some different incantation to
semi-disable native compilation for emacs 28.x. Are you seeing those
same failures (where emacs is attempting to write into /usr/bin) on older
emacs?

> There's also one patch I want to send out before release, hopefully in
> a minute or two ;-)

OK


Re: T460: new sporadic failures with emacs 29

2023-08-26 Thread Michael J Gruber
On Sat, 26 Aug 2023 at 00:28, David Bremner wrote:
>
> Michael J Gruber  writes:
>
> > It took more runs to get some fails now, and archs vary, so I still
> > think it's a timeout. And no way to get it locally so far.
>
> I can duplicate it locally about once every 40 runs of the complete test
> suite.
>
> > ENOLISP (for me) but could it be the case that notmuch-test-wait can
> > abort its while loop too early if the first buffer write takes longer
> > than the timeout, or if some other process writes (because the process
> > parameter is nil)? Is something different for emacs 29 in this regard?
> > Any clues from sbuild?
>
> Can you try the attached patch? It needs more testing, but I did get 140
> runs of the test suite without an error.

I tried the current 0.38rc1 on COPR, and unfortunately I get the same
T460 failure (fedora-eln-aarch64 and fedora-rawhide-x86_64 this time,
out of 35 buildroots).
Did you get your fails with emacs 29 only, or with earlier emacs?

Trying with 0.37+patches on KOJI right now.

There's also one patch I want to send out before release, hopefully in
a minute or two ;-)

Michael


Re: T460: new sporadic failures with emacs 29

2023-08-25 Thread David Bremner
Michael J Gruber  writes:

> It took more runs to get some fails now, and archs vary, so I still
> think it's a timeout. And no way to get it locally so far.

I can duplicate it locally about once every 40 runs of the complete test
suite.
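
A failure rate of this kind can be estimated by running the suite in a loop and counting the failing runs. The helper below is a generic sketch and not part of notmuch; count_failures is a hypothetical name, and the command to repeat is whatever runs the test suite in your tree.

```sh
# count_failures N CMD...: run CMD N times and print how many runs failed.
count_failures() {
    runs=$1
    shift
    i=0
    failures=0
    while [ "$i" -lt "$runs" ]; do
        # Discard the command's output; only its exit status matters here.
        if ! "$@" >/dev/null 2>&1; then
            failures=$((failures + 1))
        fi
        i=$((i + 1))
    done
    echo "$failures"
}
```

For example, `count_failures 40 make test` would print the number of failing runs out of 40, assuming `make test` is how the suite is invoked. Since the output is discarded, a real run would want to keep the log of each failing iteration somewhere.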

> ENOLISP (for me) but could it be the case that notmuch-test-wait can
> abort its while loop too early if the first buffer write takes longer
> than the timeout, or if some other process writes (because the process
> parameter is nil)? Is something different for emacs 29 in this regard?
> Any clues from sbuild?

Can you try the attached patch? It needs more testing, but I did get 140
runs of the test suite without an error. 

diff --git a/test/test-lib.el b/test/test-lib.el
index 236dd99e..709c3b36 100644
--- a/test/test-lib.el
+++ b/test/test-lib.el
@@ -22,6 +22,10 @@
 
 ;;; Code:
 
+(setq native-comp-jit-compilation nil)
+(setq native-comp-speed -1)
+(setq native-comp-async-jobs-number 1)
+
 (require 'cl-lib)
 
 ;; Ensure that the dynamic variables that are defined by this library


Re: T460: new sporadic failures with emacs 29

2023-08-24 Thread Michael J Gruber
On Thu, 24 Aug 2023 at 17:10, David Bremner wrote:
>
> David Bremner  writes:
>
> > I just saw this when running in debian's "sbuild" isolated build
> > environment. So my current guess is that this has to do with HOME
> > pointing somewhere nonexistent. Is that also the case in COPR?
> >
> > d
>
> I realized that we override HOME inside the tests anyway, so emacs
> should think there is some writable HOME in any case. I did notice that
> the tests trigger a bunch of emacs native compilation (because the
> caching happens in the temporary $HOME, which gets blown away every
> time).

Also, $HOME is set in all my build envs (pass or fail), and
permissions are the same. Bummer.

It took more runs to get some fails now, and archs vary, so I still
think it's a timeout. And no way to get it locally so far.

ENOLISP (for me) but could it be the case that notmuch-test-wait can
abort its while loop too early if the first buffer write takes longer
than the timeout, or if some other process writes (because the process
parameter is nil)? Is something different for emacs 29 in this regard?
Any clues from sbuild?

Michael


Re: T460: new sporadic failures with emacs 29

2023-08-24 Thread David Bremner
David Bremner  writes:

> I just saw this when running in debian's "sbuild" isolated build
> environment. So my current guess is that this has to do with HOME
> pointing somewhere nonexistent. Is that also the case in COPR?
>
> d

I realized that we override HOME inside the tests anyway, so emacs
should think there is some writable HOME in any case. I did notice that
the tests trigger a bunch of emacs native compilation (because the
caching happens in the temporary $HOME, which gets blown away every
time). 
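
The effect of a throwaway HOME can be illustrated with a small shell sketch. It is not the test suite's actual code; with_temp_home is a hypothetical helper. The command runs with HOME pointing at a fresh, empty directory that is deleted afterwards, so anything cached under $HOME has to be rebuilt on the next run.

```sh
# with_temp_home CMD...: run CMD with HOME set to a fresh temporary
# directory, then remove that directory and return CMD's exit status.
with_temp_home() {
    tmp_home=$(mktemp -d) || return 1
    # The subshell keeps the HOME override from leaking into the caller.
    if (HOME=$tmp_home; export HOME; "$@"); then
        rc=0
    else
        rc=$?
    fi
    rm -rf "$tmp_home"
    return "$rc"
}
```

Running `with_temp_home sh -c 'echo "$HOME"'` twice prints two different directories, neither of which exists afterwards.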


Re: T460: new sporadic failures with emacs 29

2023-08-24 Thread Michael J Gruber
On Thu, 24 Aug 2023 at 16:01, David Bremner wrote:
>
> David Bremner  writes:
>
> > Michael J Gruber  writes:
> >
> >> [email protected]
> >> -http://notmuchmail.org/mailman/listinfo/notmuch
> >> *ERROR*: Opening output file: Permission denied, /usr/bin/OUTPUT
> >>  PASS   Stash id
> >> ```
> >>
> >> That "/usr/bin/OUTPUT" looks strange and smells like a mis-expanded
> >> variable.
> >
> > Yes, that's pretty weird. The only writes to "OUTPUT" are relative to
> > emacs default-directory. Not sure how that could be set to /usr/bin;
> > possibly some weird script involved with starting emacs?
>
> I just saw this when running in debian's "sbuild" isolated build
> environment. So my current guess is that this has to do with HOME
> pointing somewhere nonexistent. Is that also the case in COPR?
>

I encountered this on koji (the main Fedora infra), too, and am trying
with an increased wait (1 rather than 0.1) right now. Dunno by how
much this increases test suite run times.

HOME could be an issue only if some builder VMs are set up
differently, I guess? They shouldn't be (which does not necessarily
mean they aren't).

Michael


Re: T460: new sporadic failures with emacs 29

2023-08-24 Thread David Bremner
David Bremner  writes:

> Michael J Gruber  writes:
>
>> [email protected]
>> -http://notmuchmail.org/mailman/listinfo/notmuch
>> *ERROR*: Opening output file: Permission denied, /usr/bin/OUTPUT
>>  PASS   Stash id
>> ```
>>
>> That "/usr/bin/OUTPUT" looks strange and smells like a mis-expanded
>> variable.
>
> Yes, that's pretty weird. The only writes to "OUTPUT" are relative to
> emacs default-directory. Not sure how that could be set to /usr/bin;
> possibly some weird script involved with starting emacs?

I just saw this when running in debian's "sbuild" isolated build
environment. So my current guess is that this has to do with HOME
pointing somewhere nonexistent. Is that also the case in COPR?

d


Re: T460: new sporadic failures with emacs 29

2023-08-24 Thread David Bremner
Michael J Gruber  writes:

> [email protected]
> -http://notmuchmail.org/mailman/listinfo/notmuch
> *ERROR*: Opening output file: Permission denied, /usr/bin/OUTPUT
>  PASS   Stash id
> ```
>
> That "/usr/bin/OUTPUT" looks strange and smells like a mis-expanded
> variable.

Yes, that's pretty weird. The only writes to "OUTPUT" are relative to
emacs default-directory. Not sure how that could be set to /usr/bin;
possibly some weird script involved with starting emacs?

> Why sporadically, though? The emacs tests wait 0.1 before writing - I
> dunno why, but those waits are fragile and make me nervous about even
> keeping the tests for release builds.

One thing that might help is to make the wait some global variable
amount of time, and various CI/build scenarios could set it to some
generous length.
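
As an illustration of that idea, here is a shell analogue of a wait whose interval comes from one global setting. The waits discussed in this thread live in the Emacs Lisp test helpers, so this is not a patch. NOTMUCH_TEST_WAIT and wait_for_file are hypothetical names, and fractional arguments to sleep are assumed to be supported (they are with GNU and BSD sleep, but not guaranteed by POSIX).

```sh
# wait_for_file FILE [MAX_TRIES]: poll until FILE exists, sleeping
# NOTMUCH_TEST_WAIT seconds (default 0.1) between checks. Fails if FILE
# has not appeared after MAX_TRIES checks (default 50).
wait_for_file() {
    file=$1
    tries=${2:-50}
    while [ "$tries" -gt 0 ]; do
        if [ -e "$file" ]; then
            return 0
        fi
        sleep "${NOTMUCH_TEST_WAIT:-0.1}"
        tries=$((tries - 1))
    done
    return 1
}
```

A slow builder could then export something like NOTMUCH_TEST_WAIT=1 once, instead of every wait in the tests being edited individually.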


>
> I guess due to its load, COPR is prone to exposing timing issues.
>

There are some very slow architectures (e.g. mipsel) on the Debian
buildds, so I'm a bit surprised we don't see similar issues there.
Maybe you are just doing more builds (which is great, obviously).