Re: T350-crypto T357-index-decryption: possible race condition?

2023-05-11 Thread David Bremner
Michael J Gruber writes:

> Building notmuch 0.37 with my usual spec-file as a rawhide-mock build
> (a local chroot for the development "version" of Fedora which will
> become Fedora 39) I see:
> ```
> T350-crypto: Testing PGP/MIME signature verification and decryption
>  PASS   emacs delivery of signed message via fcc
>  PASS   emacs delivery of signed message via fcc and smtp
>  PASS   signed part content-type indexing
>  PASS   signature verification
>  PASS   detection of modified signed contents
>  PASS   corrupted pgp/mime signature
>  PASS   signature verification without full user ID validity
>  PASS   signature verification with signer key unavailable
> ```
> There the suite "hangs" for about 2 minutes, followed by

This sounds suspiciously like "timeout" kicking in and killing a test
that is taking too long. You can set NOTMUCH_TEST_TIMEOUT in the
environment to some smaller/larger number to test this.
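To illustrate the mechanism (a sketch only, not the actual test
harness): assuming GNU coreutils "timeout" is available, it sends
SIGTERM (signal 15) to the command once the limit expires and then
exits with status 124 itself. The helper name run_with_limit is made
up for this example.

```shell
#!/bin/sh
# Sketch only: run a command under a time limit, roughly as a test
# harness might. run_with_limit is a hypothetical helper, not part of
# notmuch.
run_with_limit() {
    limit=$1; shift
    status=0
    # timeout sends SIGTERM (signal 15) by default when the limit expires
    timeout "$limit" "$@" || status=$?
    if [ "$status" -eq 124 ]; then
        # 124 is timeout's own exit status when it had to kill the command
        echo "killed after $limit"
    else
        echo "finished with status $status"
    fi
}

run_with_limit 1 sleep 3    # exceeds the 1 second limit
run_with_limit 5 true       # finishes well within the limit
```

A pause of roughly the configured limit followed by "interrupted by
signal 15" would be consistent with this pattern.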

The first subtest in T357 is also sending an encrypted message, so it
looks like some bad interaction between gpg and emacs. Maybe you can try
sending an encrypted message from emacs interactively in your chroot
environment.

0) You will first need to run

   gpg --no-tty --import ./test/openpgp4-secret-key.asc

This will create a keyring etc. in the chroot.

1) You can start emacs using the script

   ./devel/try-emacs-mua -q


2) Copy the following into *scratch* and run with C-j after the last
closing paren (this is just copied from the test suite internals)

(let ((message-send-mail-function (lambda () t))
      (mail-host-address "example.com"))
  (notmuch-mua-mail)
  (message-goto-to)
  (insert "test_su...@notmuchmail.org\nDate: 01 Jan 2000 12:00:00 -")
  (message-goto-subject)
  (insert "My Subject")
  (message-goto-body)
  (insert "a body")
  (mml-secure-message-encrypt)
  (let ((mml-secure-smime-sign-with-sender t)
        (mml-secure-openpgp-sign-with-sender t))
    (notmuch-mua-send-and-exit)))

You will probably need to answer "c" to create the sent-mail folder at
the prompt.

> ```
> In the end, the suite complains:
> ```
> '/builddir/build/BUILD/notmuch-0.37/test/test-results/T350-crypto'
> does not exist!
> '/builddir/build/BUILD/notmuch-0.37/test/test-results/T357-index-decryption'
> does not exist!
> ```
> At least for T350 this is strange because several subtests ran and
> passed! This indicates a race or a wrong signal trap.

Note that those files are summaries created by test_done, so it's not
that surprising that they are not there: the test scripts were killed
before they reached test_done.
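
The effect can be reproduced in miniature. The following is only a
sketch under assumptions, not the notmuch harness: a stand-in "test
script" prints a PASS line and writes its summary file as its very last
step, and GNU coreutils "timeout" is assumed to be available. The names
fake-test, T-hangs and T-ok and the temporary results directory are
made up for the example.

```shell
#!/bin/sh
# Sketch only. The stand-in script prints a PASS line, "works" for $2
# seconds, and writes its summary file $1 as the final step (the role
# test_done plays in the real tests).
results=$(mktemp -d)    # hypothetical stand-in for test/test-results
script='echo " PASS   a subtest"; sleep "$2"; echo summary > "$1"'

# The first run exceeds its 1 second limit and is killed by SIGTERM
# before the summary is written; the second run completes normally.
timeout 1 sh -c "$script" fake-test "$results/T-hangs" 3 ||
    echo "T-hangs: timeout exit status $?"
timeout 5 sh -c "$script" fake-test "$results/T-ok" 0

for t in T-hangs T-ok; do
    if [ -e "$results/$t" ]; then
        echo "$t: summary written"
    else
        echo "$t: summary does not exist"
    fi
done
```

Both runs print their PASS line, but only T-ok leaves a summary behind,
so passing subtests and a missing summary file are compatible.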
___
notmuch mailing list -- notmuch@notmuchmail.org
To unsubscribe send an email to notmuch-le...@notmuchmail.org


T350-crypto T357-index-decryption: possible race condition?

2023-05-11 Thread Michael J Gruber
Hi there,

my regular notmuch test builds recently started to fail, more
concretely: the test suite fails because some subtests are KILLed.
Building notmuch 0.37 with my usual spec-file as a rawhide-mock build
(a local chroot for the development "version" of Fedora which will
become Fedora 39) I see:
```
T350-crypto: Testing PGP/MIME signature verification and decryption
 PASS   emacs delivery of signed message via fcc
 PASS   emacs delivery of signed message via fcc and smtp
 PASS   signed part content-type indexing
 PASS   signature verification
 PASS   detection of modified signed contents
 PASS   corrupted pgp/mime signature
 PASS   signature verification without full user ID validity
 PASS   signature verification with signer key unavailable
```
There the suite "hangs" for about 2 minutes, followed by
```
FATAL: /builddir/build/BUILD/notmuch-0.37/test/T350-crypto.sh:
interrupted by signal 15
```
It proceeds until
```
T357-index-decryption: Testing indexing decrypted mail
```
and hangs again for about 2 minutes, followed by
```
FATAL: /builddir/build/BUILD/notmuch-0.37/test/T357-index-decryption.sh:
interrupted by signal 15
```
In the end, the suite complains:
```
'/builddir/build/BUILD/notmuch-0.37/test/test-results/T350-crypto'
does not exist!
'/builddir/build/BUILD/notmuch-0.37/test/test-results/T357-index-decryption'
does not exist!
```
At least for T350 this is strange because several subtests ran and
passed! This indicates a race or a wrong signal trap.

The same problem happens with notmuch 0.37 in Fedora's infrastructure
(koji rawhide, e.g.
https://koji.fedoraproject.org/koji/taskinfo?taskID=101014703).

Curiously, everything seems to work with notmuch 0.37 in Fedora 38,
which is the current release, in both koji and locally in mock.

BUT: In Fedora's secondary test-bed (copr) and with notmuch from git,
these kinds of errors happen on released Fedora versions, too. This was
rather erratic, but I suspected something related to emacs 28 and
test timeouts. So I increased the timeout in the test lisp lib (see
below), hoping for the better, but it got worse, or at least
deterministically worse: With this change, the test suite fails
reliably (the two tests mentioned above plus T315-emacs-tagging) on all
Fedoras (with Emacs 28) and passes on epel (with Emacs 27), see for
example:
https://copr.fedorainfracloud.org/coprs/mjg/notmuch-git/build/5908525/

Now, emacs is not the only difference, and the complete test result
directory disappearing is still strange, as is really all of this.
Help please ;)

(There is another problem related to Python 3.12 which I'll address
separately - rawhide still carries 3.11.)

```
---
 test/test-lib.el | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/test/test-lib.el b/test/test-lib.el
index 79a9d4d6..39ade9b9 100644
--- a/test/test-lib.el
+++ b/test/test-lib.el
@@ -39,7 +39,7 @@
 (defun notmuch-test-wait ()
   "Wait for process completion."
   (while (get-buffer-process (current-buffer))
-    (accept-process-output nil 0.1)))
+    (accept-process-output nil 120)))

 (defun test-output (&optional filename)
   "Save current buffer to file FILENAME.  Default FILENAME is OUTPUT."
```