On 14/07/2023 14:04, Jon Turney wrote:
On 13/07/2023 19:53, Corinna Vinschen wrote:
normally after 10 seconds. (See the commentary in pthread::cancel() in
thread.cc, where it checks if the target thread is inside the kernel,
and silently converts the cancellation into a deferred one)

Nevertheless, I think this is ok to do.  The description of pthread_cancel
contains this:

   Asynchronous cancelability means that the thread can be canceled at
   any time (usually immediately, but the system does not guarantee this).

And

   The above steps happen asynchronously with respect to the
   pthread_cancel() call; the return status of pthread_cancel() merely
   informs the caller whether the cancellation request was successfully
   queued.

So any assumption *when* the cancallation takes place is may be wrong.

Yeah.

I think the flakiness is when we happen to try to async cancel while in the Windows kernel, which implicitly converts to a deferred cancellation, but there are no cancellation points in the the thread, so it arrives at pthread_exit() and returns a exit code other than PTHREAD_CANCELED.

I did consider making the test non-flaky by adding a final call to pthread_testcancel(), to notice any failed async cancellation which has been converted to a deferred one.

But then that is just the same as the deferred cancellation tests, and confirms the cancellation happens, but not that it's async, which is part of the point of the test.

I guess this could also check that not all of the threads ran for all 10 seconds, which would indicate that at least some of them were cancelled asynchronously.

I wrote this, attached, which does indeed make this test more reliable.

You point is well made that this is making assumptions about how quickly async cancellation works that are not required by the standard

(It would be a valid, if strange implementation, if async cancellation always took 10 seconds to take effect, which this test assumes isn't the case)

Perhaps there is a better way to write a test that async cancellation works in the absence of cancellation points, but it eludes me...
From d0023fa3ea1e8f29e80d473ab13d8200bdd2dc3a Mon Sep 17 00:00:00 2001
From: Jon Turney <jon.tur...@dronecode.org.uk>
Date: Sat, 15 Jul 2023 17:57:43 +0100
Subject: [PATCH] Cygwin: testsuite: Make cancel3 and cancel5 more robust

Despite our efforts, sometimes the async cancellation gets deferred.

Notice this by calling pthread_testcancel(), and then try to work out if
async cancellation was ever successful by checking if all threads ran
for the full 10 seconds, or if some were stopped early.
---
 winsup/testsuite/winsup.api/pthread/cancel3.c | 16 +++++++++++++++-
 winsup/testsuite/winsup.api/pthread/cancel5.c | 14 ++++++++++++++
 2 files changed, 29 insertions(+), 1 deletion(-)

diff --git a/winsup/testsuite/winsup.api/pthread/cancel3.c 
b/winsup/testsuite/winsup.api/pthread/cancel3.c
index 07feb7c9b..075f052cc 100644
--- a/winsup/testsuite/winsup.api/pthread/cancel3.c
+++ b/winsup/testsuite/winsup.api/pthread/cancel3.c
@@ -92,6 +92,9 @@ mythread(void * arg)
        }
     }
 
+  /* Notice if asynchronous cancel got deferred */
+  pthread_testcancel();
+
   return result;
 }
 
@@ -101,6 +104,7 @@ main()
   int failed = 0;
   int i;
   pthread_t t[NUMTHREADS + 1];
+  int ran_to_completion = 0;
 
   assert((t[0] = pthread_self()) != NULL);
 
@@ -130,7 +134,7 @@ main()
    * Standard check that all threads started.
    */
   for (i = 1; i <= NUMTHREADS; i++)
-    { 
+    {
       if (!threadbag[i].started)
        {
          failed |= !threadbag[i].started;
@@ -166,9 +170,19 @@ main()
                  threadbag[i].count,
                  result);
        }
+
+      if (threadbag[i].count >= 10)
+       ran_to_completion++;
+
       failed = (failed || fail);
     }
 
+  if (ran_to_completion >= 10)
+    {
+      fprintf(stderr, "All threads ran to completion, async cancellation never 
happened\n");
+      failed = TRUE;
+    }
+
   assert(!failed);
 
   /*
diff --git a/winsup/testsuite/winsup.api/pthread/cancel5.c 
b/winsup/testsuite/winsup.api/pthread/cancel5.c
index 999b3c95c..23c02afe4 100644
--- a/winsup/testsuite/winsup.api/pthread/cancel5.c
+++ b/winsup/testsuite/winsup.api/pthread/cancel5.c
@@ -93,6 +93,9 @@ mythread(void * arg)
        }
     }
 
+  /* Notice if asynchronous cancel got deferred */
+  pthread_testcancel();
+
   return result;
 }
 
@@ -102,6 +105,7 @@ main()
   int failed = 0;
   int i;
   pthread_t t[NUMTHREADS + 1];
+  int ran_to_completion = 0;
 
   for (i = 1; i <= NUMTHREADS; i++)
     {
@@ -165,9 +169,19 @@ main()
                  threadbag[i].count,
                  result);
        }
+
+      if (threadbag[i].count >= 10)
+       ran_to_completion++;
+
       failed = (failed || fail);
     }
 
+  if (ran_to_completion >= 10)
+    {
+      fprintf(stderr, "All threads ran to completion, async cancellation never 
happened\n");
+      failed = TRUE;
+    }
+
   assert(!failed);
 
   /*
-- 
2.39.0

Reply via email to