Re: request for clarification on Open Group Base Specifications Issue 7: Canc...

Alexander Terekhov Mon, 12 Jun 2017 02:02:40 -0700

Hi,

> but return ETIMEDOUT and leave the cancellation state pending.


so that you can call pthread_testcancel() upon ETIMEDOUT return (if cancel 
is more important to you than a timeout).

I don't see what is wrong here.
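
A minimal sketch of this workaround in C. The helper name 
wait_ready_or_cancel and the ready flag are hypothetical and not part of 
any library. The caller waits with an absolute deadline and, on ETIMEDOUT, 
calls pthread_testcancel() so that a cancellation request left pending by 
the wait is acted upon at that point. The mutex is still held when 
pthread_testcancel() runs, which assumes the caller has pushed a cleanup 
handler that unlocks it, as it would already need for a cancellation 
inside pthread_cond_timedwait().

```c
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical helper (illustration only).
 * Precondition: the caller holds *mutex.
 * Waits until *ready becomes nonzero or the absolute deadline passes.
 * Returns 0 when ready, or the error from pthread_cond_timedwait(). */
int wait_ready_or_cancel(pthread_cond_t *cond, pthread_mutex_t *mutex,
                         const int *ready, const struct timespec *deadline)
{
    int rc = 0;

    /* Loop to absorb spurious wakeups; stop on the first error. */
    while (!*ready && rc == 0)
        rc = pthread_cond_timedwait(cond, mutex, deadline);

    if (rc == ETIMEDOUT) {
        /* If the implementation left a cancellation request pending,
         * act on it here. The mutex is held, matching the state in which
         * cleanup handlers run for a cancelled condition wait. */
        pthread_testcancel();
    }
    return rc;
}
```

If no cancellation request is pending, pthread_testcancel() simply returns 
and the caller sees ETIMEDOUT as usual.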

regards,
alexander.



From:   Dimitri Staessens <dimitri.staess...@ugent.be>
To:     shwares...@aol.com, austin-group-l@opengroup.org
Date:   11.06.2017 09:08
Subject:        Re: request for clarification on Open Group Base 
Specifications Issue 7: Canc...



Hi,
Thank you for your response.
On 06/11/17 00:18, shwares...@aol.com wrote:
I don't see the interface invoked to block the thread is allowed to cancel 
the request completely, simply that the interface may return to reduce 
serialization latencies for the 2 bullet point cases and the cancellation 
honored at the next plausible point in the code path, in accordance with 
explanation of XRAT B.2.9.5, (P3657, L125092-7, C165.pdf), rather than 
handle the cancel immediately before honoring the expiration and perhaps 
leave some shared resources in an inconsistent state. However, application 
code may be written so there is no "next plausible point" because it goes 
into a loop that does not use any of the thread cancellation interfaces, 
so it will look like the request is not being acted upon.

I agree. But what if the next plausible point does the same? And what if 
it is in a loop? This deferral of thread cancellation must only be allowed 
if the cancellation state was not yet set at the time of invocation of the 
cancellation point, but was set at a later point, e.g. when the 
cancellation point was in a suspended state. I think this is how the 
specification should be read, but apparently it can be interpreted 
differently.

To elaborate a bit on the specific case that is now present in glibc 2.25:

The precondition prior to invoking the cancellation point is that the 
cancellation state is set to cancel the thread. The thread invokes a 
specified cancellation point with a timeout, but that timeout has already 
expired. The implementation will not cancel this thread, but return 
ETIMEDOUT and leave the cancellation state pending. This has to be a 
misinterpretation of clause 2 ("a timeout expired"). The reason why this 
has to be an invalid interpretation is that it leads to inconsistent and 
unpredictable behaviour. If the handling of the cancellation 
depends on the timeout value passed to the cancellation point, what value 
will guarantee cancellation? It depends on so many factors (the 
implementation, the machine it runs on, the load on that machine, etc.) 
that it is probably less predictable than tossing a coin.
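
The case described above can be probed with a small C sketch. The names 
run_probe, worker and timedwait_returned are hypothetical. As a 
simplification, the sketch uses the default clock of the condition 
variable with a deadline at the epoch, rather than the monotonic clock 
mentioned in the bug report. The thread first makes a cancellation request 
against itself (with deferred cancelability the request stays pending), 
then enters pthread_cond_timedwait() with an already expired timeout. 
Which branch is taken depends on the implementation, so no particular 
result is claimed here.

```c
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Set to 1 if the timed wait returned instead of acting on the cancel. */
static int timedwait_returned = 0;

static void unlock_mutex(void *arg)
{
    pthread_mutex_unlock(arg);
}

static void *worker(void *arg)
{
    static pthread_mutex_t mutex = PTHREAD_MUTEX_INITIALIZER;
    static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;
    struct timespec expired = {0, 0};   /* absolute time far in the past */
    (void)arg;

    /* Cancelability is enabled and deferred by default, so this only
     * marks the request as pending. */
    pthread_cancel(pthread_self());

    pthread_mutex_lock(&mutex);
    pthread_cleanup_push(unlock_mutex, &mutex);

    /* Cancellation point entered with a request already pending. */
    pthread_cond_timedwait(&cond, &mutex, &expired);

    /* Reached only if the request was left pending by the wait. */
    timedwait_returned = 1;
    pthread_testcancel();

    pthread_cleanup_pop(1);
    return NULL;
}

/* Returns 0 if the wait acted on the pending request, 1 if it returned
 * first and the request was honored by pthread_testcancel(), -1 on error. */
int run_probe(void)
{
    pthread_t t;
    void *result;

    if (pthread_create(&t, NULL, worker, NULL) != 0)
        return -1;
    if (pthread_join(t, &result) != 0)
        return -1;
    if (result != PTHREAD_CANCELED)
        return -1;
    return timedwait_returned;
}
```

A result of 1 from run_probe() would correspond to the behaviour reported 
for glibc 2.25, and 0 to the reading in which a request pending at entry 
is always acted upon.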

This is the text from previous versions of POSIX:

Whenever a thread has cancelability enabled and a cancellation request has 
been made with that thread as the target, and the thread then calls any 
function that is a cancellation point (such as pthread_testcancel() or 
read()), the cancellation request shall be acted upon before the function 
returns. If a thread has cancelability enabled and a cancellation request 
is made with the thread as a target while the thread is suspended at a 
cancellation point, the thread shall be awakened and the cancellation 
request shall be acted upon. However, if the thread is suspended at a 
cancellation point and the event for which it is waiting occurs before the 
cancellation request is acted upon, it is unspecified whether the 
cancellation request is acted upon or whether the cancellation request 
remains pending and the thread resumes normal execution.

Note that the final sentence of this paragraph starts with "However, if 
the thread is suspended"...
This seems to have been rewritten in the latest version in a way that 
leaves some room for interpretation, but it is absolutely crucial that 
this condition remains there, so that the interpretation taken by the 
current glibc implementation is clearly invalid.

If threading is non-preemptive, an infinite loop of this nature means it 
won't be acted upon at all, as an extreme case. What seems to be missing 
there is a requirement that the interfaces that are allowed to resume 
execution check for cancellation requests being active immediately before 
they return, and honor them, in addition to when they begin execution and 
honor them when they want to. 
I think requiring checking the cancellation state before returning would 
be too restrictive for some implementations, and that is the reason for 
the two clauses. The requirement that any cancellation point must honor 
any cancellation request that is pending before invocation is enough. It 
should just be made clear that the clause "It is unspecified whether the 
cancellation request is acted upon if ... a specified timeout expired" 
cannot have the effect of breaking it.
This has the effect the thread resumes normal execution, as currently 
stated, until the end of the interface invocation.
Thank you so much.

Dimitri
 
In a message dated 6/9/2017 1:53:32 P.M. Eastern Daylight Time, 
dimitri.staess...@ugent.be writes:
Dear,
There is a paragraph in the Base Specifications regarding Cancellation 
Points that seems to leave some room for interpretation, with rather dire 
consequences:
http://pubs.opengroup.org/onlinepubs/9699919799/functions/V2_chap02.html
It concerns the following paragraph:
Whenever a thread has cancelability enabled and a cancellation request has 
been made with that thread as the target, and the thread then calls any 
function that is a cancellation point (such as pthread_testcancel() or 
read()), the cancellation request shall be acted upon before the function 
returns. If a thread has cancelability enabled and a cancellation request 
is made with the thread as a target while the thread is suspended at a 
cancellation point, the thread shall be awakened and the cancellation 
request shall be acted upon. It is unspecified whether the cancellation 
request is acted upon or whether the cancellation request remains pending 
and the thread resumes normal execution if:
  * The thread is suspended at a cancellation point and the event for 
    which it is waiting occurs
  * A specified timeout expired
before the cancellation request is acted upon.

In the newest glibc implementation (2.25), the clause "It is unspecified 
whether the cancellation request is acted upon if ... a specified timeout 
expired" is taken against the first statement of the paragraph. The new 
implementation of pthread_cond_timedwait() does not act upon a pending 
cancellation request if the abstime (specified using the monotonic clock) 
has already expired.
See the bug report and discussion here:
https://sourceware.org/bugzilla/show_bug.cgi?id=21291
From the way this paragraph is written, I think the interpretation by the 
developer is, however unpalatable, a valid one. However, my interpretation 
is that the first statement (that a cancellation request that is pending 
before any cancellation point is entered must always be acted upon, 
irrespective of any input to the cancellation point) is non-negotiable and 
the clauses are only valid when there was no pending cancellation request 
at the time of entry into the cancellation point. This would be a much 
more robust interpretation.
Can you please clarify how this should be interpreted?
thank you very much for your assistance,
Dimitri Staessens
Ghent University-imec
