Re: One more review.

Marc Nieper-Wißkirchen Wed, 07 Sep 2022 13:55:57 -0700

On Wed, Sep 7, 2022 at 21:02, John Cowan <[email protected]> wrote:

>
>
> On Wed, Sep 7, 2022 at 12:12 PM Marc Nieper-Wißkirchen <
> [email protected]> wrote:
>
>
>> Yes, data structures whose locks are in an abandoned state are
>> potentially inconsistent, and if they are, the best one can do with them
>> is to have them GC'ed.  But this does not necessarily mean that the
>> global state of a program becomes inconsistent, does it?
>>
>
> It means that the future behavior of your program is completely
> unpredictable modulo special cases.
>

Unpredictable if one terminates a random thread, but I still don't see why
it would be unpredictable if only certain threads are terminated.  Think of
a Scheme interpreter whose REPL evaluates a user-entered expression in a
thread.  Typing CTRL-C should terminate the evaluation (let's forget for a
moment that this usually goes through a signal handler because it doesn't
matter here).  A Scheme interpreter can be written so that terminating
the thread does not leave the REPL and the rest of the system in an
inconsistent state.  This is at least the case with existing Scheme
implementations, I think.

In fact, this is possibly the best use case for `thread-terminate!`, as it
allows users to write a useful REPL themselves.  By the way, this is an
argument against what I proposed under the name "critical-section" or
"atomic", so let's drop that proposal.

>   I've probably referred to this before, but see <
> https://docs.oracle.com/javase/8/docs/technotes/guides/concurrency/threadPrimitiveDeprecation.html>
> for explanations of why tampering with threads from outside the thread is
> unsafe in the general case.  To quote the most important part:
>
> If any of the objects previously protected by these monitors were in an
> inconsistent state, other threads may now view these objects in an
> inconsistent state. Such objects are said to be *damaged*. When threads
> operate on damaged objects, arbitrary behavior can result. This behavior
> may be subtle and difficult to detect, or it may be pronounced. Unlike
> other unchecked exceptions, ThreadDeath kills threads silently; thus, the
> user has no warning that his program may be corrupted. The corruption can
> manifest itself at any time after the actual damage occurs, even hours or
> days in the future.
>
>> Using atomic operations, one can also program the data structures in a
>> way so that they can recover from an abandoned lock.
>>
>
> The data structure may recover, but the code that has already relied on
> the broken data structure cannot.  If your sequence data structure has
> become circular when it was never expected to be, your thread that is
> trying to process it may go into an infinite loop processing what is
> otherwise garbage.  When a time structure of hours-minutes-seconds was
> being updated from 00:00:59 to 00:01:00 and is terminated leaving 00:01:59,
> whatever thread depends on the time concludes that it has missed its
> deadline by a whole minute and panics the entire system to prevent further
> damage.  And so on.
>

The time structure is obviously used by several threads, so it should be
protected by a mutex.  The modifying thread locks the mutex and begins to
update the value.  If it is then terminated, the lock is left abandoned.
Another thread trying to read the time will try to lock the mutex.  But
then an abandoned mutex exception is raised, signaling that the time value
is possibly incorrect.  So the reading thread would not mistakenly assume
that a minute has already passed, and the system remains in a consistent
state (the exception can propagate further if necessary).
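
To make the argument concrete, here is a rough simulation in Python.  Python
has neither `thread-terminate!` nor abandoned mutexes, so everything here is
an assumption made for illustration: the `Mutex` class, the
`AbandonedMutexError` exception, and the two functions are hypothetical
names, and "termination" is modeled by a writer thread that simply returns
without unlocking.  It is a sketch of the idea, not of how any Scheme
implementation does it.

```python
import threading


class AbandonedMutexError(Exception):
    """Raised when a mutex is acquired after its owner ended without unlocking."""


class Mutex:
    """Toy owner-tracking mutex (hypothetical, not a real library class)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._owner = None  # thread that currently holds the mutex, if any

    def lock(self):
        with self._cond:
            # Wait while a live thread holds the mutex; poll so that an
            # owner that dies without notifying is still noticed.
            while self._owner is not None and self._owner.is_alive():
                self._cond.wait(timeout=0.01)
            abandoned = self._owner is not None
            self._owner = threading.current_thread()
        if abandoned:
            # The caller owns the mutex now, but is told the data is suspect.
            raise AbandonedMutexError("previous owner never unlocked")

    def unlock(self):
        with self._cond:
            self._owner = None
            self._cond.notify_all()


def interrupted_update(mutex, clock):
    """Writer that 'dies' after bumping the minutes, before resetting seconds."""
    mutex.lock()
    clock[1] += 1
    return  # simulated termination: seconds not reset, mutex not unlocked


def read_time(mutex, clock):
    """Return the time, or None if the mutex was abandoned."""
    try:
        mutex.lock()
    except AbandonedMutexError:
        # A real program would repair or discard the structure here
        # before unlocking; this sketch only reports the problem.
        mutex.unlock()
        return None
    try:
        return tuple(clock)
    finally:
        mutex.unlock()


clock = [0, 0, 59]  # hours, minutes, seconds
mutex = Mutex()
writer = threading.Thread(target=interrupted_update, args=(mutex, clock))
writer.start()
writer.join()
print(clock, read_time(mutex, clock))
```

In this run the structure is left at 00:01:59, but the reader gets `None`
instead of the bogus time, so it has no reason to conclude that a deadline
was missed.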

If the time structure is still crucial even after the forced termination of
any thread, the protocol could be as follows:  The time structure comes
with a shadow field and an atomic flag.  A thread that wants to update the
time first locks the mutex.  Then it populates the shadow field with the
updated current time.  Afterward, it raises the atomic flag.  Then it
copies the time from the shadow field to the actual time.  Finally, it
lowers the atomic flag and unlocks the mutex.

When a reading thread tries to lock the mutex but sees an abandoned one, it
checks the state of the atomic flag.  If it is raised, the shadow field
holds a consistent time object (which the reader copies onto the actual
time before lowering the flag).  If it is lowered, the actual time
structure is consistent.

All that is needed to make this feasible is that modifications of single
locations in the store are atomic (which is not hard to achieve in a Scheme
system).
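
The protocol can be checked with a small single-threaded simulation, again
in Python and again with made-up names (`Clock`, `update`, `read`,
`Terminated`, and the `die_after` parameter are all hypothetical).  The
assumption is that each assignment to a single field is one atomic store,
and that the writer can be terminated between any two stores.  The `held`
attribute stands in for the mutex: if it is still set when the reader
arrives, the mutex counts as abandoned.

```python
class Terminated(Exception):
    """Stands in for a forced thread termination (hypothetical helper)."""


class Clock:
    def __init__(self, h, m, s):
        self.time = [h, m, s]    # actual time structure
        self.shadow = [h, m, s]  # shadow field
        self.flag = False        # atomic flag: a single store location
        self.held = False        # models the mutex; still set = abandoned


def update(clock, new, die_after=None):
    """Writer protocol; die_after=k terminates after k single-location stores."""
    stores = 0

    def checkpoint():
        # Possible termination point between two atomic stores.
        if stores == die_after:
            raise Terminated()

    clock.held = True                    # lock the mutex
    for i in range(3):                   # 1. populate the shadow field
        checkpoint()
        clock.shadow[i] = new[i]
        stores += 1
    checkpoint()
    clock.flag = True                    # 2. raise the flag
    stores += 1
    for i in range(3):                   # 3. copy shadow to the actual time
        checkpoint()
        clock.time[i] = clock.shadow[i]
        stores += 1
    checkpoint()
    clock.flag = False                   # 4. lower the flag
    stores += 1
    checkpoint()
    clock.held = False                   # unlock the mutex


def read(clock):
    """Reader protocol, including recovery from an abandoned mutex."""
    if clock.held:                       # lock attempt reports abandonment
        if clock.flag:
            # Shadow is complete: finish the terminated writer's copy.
            clock.time[:] = clock.shadow
            clock.flag = False
        clock.held = False               # unlock normally
    return tuple(clock.time)


old, new = (0, 0, 59), (0, 1, 0)
for k in range(9):                       # every possible termination point
    c = Clock(*old)
    try:
        update(c, new, die_after=k)
    except Terminated:
        pass
    assert read(c) in (old, new)         # never a mix such as 00:01:59
```

Whatever the termination point, the reader sees either the old time (the
writer died before raising the flag) or the new time (it died afterward).
If the reader is itself terminated during recovery, the flag is still raised
and the shadow is untouched, so the next reader can repeat the same steps.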

In other words, using abandoned mutexes, one can ensure that whatever
thread accesses a data structure finishes the work left over by a
forcefully terminated thread.

Marc (in CC) surely can say much more about this, but I thought that the
idea of abandoned mutexes in SRFI 18 is there so that `thread-terminate!`
can be handled gracefully.

Marc
