On 27.07.2025 13:24, Rick McGuire wrote:
If an exception happens, the detachThread() never happens, which would leave the interpreter in an inconsistent state. The best fix would be to write a small class with a destructor method that will do the detach thread when the variable leaves the scope.

Thank you, Rick.

How about:

   Index: interpreter/runtime/Interpreter.cpp
   ===================================================================
   --- interpreter/runtime/Interpreter.cpp      (revision 12998)
   +++ interpreter/runtime/Interpreter.cpp      (working copy)
   @@ -235,7 +235,10 @@
             // to shut down so that the package manager can unload
             // the libraries (it needs to pass a RexxThreadContext
             // pointer out to package unloaders, if they are defined)
   -        InstanceBlock instance;
   +        InstanceBlock tmpInstance;
   +        tmpInstance.instance->InterpreterInstance::attachThread();
   +        InstanceDetacher instDet = InstanceDetacher(tmpInstance.instance);
   +
             // run whatever uninits we can before we start releasing the 
libraries
             memoryObject.lastChanceUninit();
@@ -487,6 +490,28 @@ /**
   + * Make sure that detachThread() gets carried out for the supplied 
InterpreterInstance.
   + *
   + * @param i InterpreterInstance
   + */
   +InstanceDetacher::InstanceDetacher(InterpreterInstance *i)
   +{
   +    // Get an instance.
   +    instance = i;
   +}
   +
   +/**
   + * Carry out detachThread() for the instance, if going out of scope.
   + */
   +InstanceDetacher::~InstanceDetacher()
   +{
   +    // detach from the instance
   +    instance->InterpreterInstance::detachThread();
   +}
   +
   +
   +
   +/**
      * Decode a condition directory into a easier to use
      * structure form for a native code user.  This breaks
      * the directory into its component pieces, including
   Index: interpreter/runtime/Interpreter.hpp
   ===================================================================
   --- interpreter/runtime/Interpreter.hpp      (revision 12998)
   +++ interpreter/runtime/Interpreter.hpp      (working copy)
   @@ -248,4 +248,14 @@
     };
+class InstanceDetacher
   +{
   +public:
   +    InstanceDetacher(InterpreterInstance *i);
   +    ~InstanceDetacher();
   +
   +    InterpreterInstance  *instance;    // interpreter instance
   +};
   +
   +
     #endif

Any comments, suggestions?

---rony




On Sun, Jul 27, 2025 at 6:51 AM Rony G. Flatscher <rony.flatsc...@wu.ac.at> 
wrote:

    It seems that the following patch fixes the reported crash:

        Index: interpreter/runtime/Interpreter.cpp
        ===================================================================
        --- interpreter/runtime/Interpreter.cpp (revision 12998)
        +++ interpreter/runtime/Interpreter.cpp (working copy)
        @@ -235,11 +235,15 @@
                  // to shut down so that the package manager can unload
                  // the libraries (it needs to pass a RexxThreadContext
                  // pointer out to package unloaders, if they are defined)
        -        InstanceBlock instance;
        +        InstanceBlock tmpInstance;
        +        tmpInstance.instance->InterpreterInstance::attachThread();
        +
                  // run whatever uninits we can before we start releasing the 
libraries
                  memoryObject.lastChanceUninit();

                  PackageManager::unload();
        +
        +        tmpInstance.instance->InterpreterInstance::detachThread();
              }
              catch (ActivityException)
              {

    Any comments, suggestions?

    ---rony


    On 23.12.2023 18:03, Rony G. Flatscher wrote:

    Since last August a crash got reported in 
<https://sourceforge.net/p/oorexx/bugs/1872/>
    <https://sourceforge.net/p/oorexx/bugs/1872/> which can be easily 
reproduced by following the
    directions there. Using the supplied simple C++-only program will work 
flawlessly if invoked
    simply as

        testTerminate.exe

    it will run flawlessly using 250 Rexx interpreter instances to execute a 
simple script
    (issuing a single dot on stdout) and then terminate it on the same thread.

    However if supplying an argument to that test program (source got uploaded 
with #1872 as
    well), then the termination of the Rexx interpreter instance occurs on a 
separate thread and
    will cause the crash upon terminating the second Rexx interpreter instance 
on a separate thread:

        testTerminate.exe 
anyArgumentSufficesToRunTheRIIOnSeparateThreadAndCauseTheCrash

    The relevant Windows code is:

        ... cut ...
        // 
from<https://stackoverflow.com/questions/1981459/using-threads-in-c-on-windows-simple-example>
 
<https://stackoverflow.com/questions/1981459/using-threads-in-c-on-windows-simple-example>,
        // 
also<https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createthread>
 
<https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-createthread>
        // #include <windows.h>

        DWORD WINAPI ThreadFunc(void* data) {
           // Do stuff.  This will be the first function called on the new 
thread.
           // When this function returns, the thread goes away.  See MSDN for 
more details.

             RexxInstance *instance = (RexxInstance *) data;
             // uncommenting the following statement, makes it run
             *// fprintf(stderr, "ThreadFunc(), arrived: terminating instance 
[%p]\n",
        instance);fflush(stderr); *
             // Now wait for the interpreter to terminate and we are done.
             instance->Terminate();

             // fprintf(stderr, "ThreadFunc(), leaving: terminating instance 
[%p]\n", instance);fflush(stderr);

           return 0;

        ... cut ...

                 if (argc>1) // if argument given terminate on separate thread
                 {
                     void *data = interpreter;
                     HANDLE thread = CreateThread(NULL, 0, ThreadFunc, data, 0, 
NULL);
                     if (thread)
                     {
                         fprintf(stderr, "Created [%d], thread [%p] for terminating 
instance [%p]\n", i, thread, interpreter);fflush(stderr);
                     }

                 }
                 else        // terminate on main thread
                 {
                     // Now wait for the interpreter to terminate and we are 
done.
                     interpreter->Terminate();
                 }

        ... cut ...

    Uncommenting the blue fprintf statement makes the crash go away *most* of 
the time such that
    all 250 Rexx interpreter instances can be terminated on a separate thread. 
Reducing that to
    just fflush(stderr) makes it go away *most* of the time as well (whereas 
using fflush(stdout)
    instead does not have any effect. So accessing stderr in that thread 
somehow influences the
    behaviour).

    ---

    Originally this crash occurred in the context of BSF4ooRexx and I reported 
it in
    <https://sourceforge.net/p/oorexx/feature-requests/806/>
    <https://sourceforge.net/p/oorexx/feature-requests/806/>. It seems that 
Rick thought it had
    to do with the code in BSF4ooRexx and not with ooRexx. Eventually this 
thread led to a new,
    small C++ (supplied with #1872 above) only (only creates a RII, runs a 
simple Rexx script,
    terminates the RII) showcase in which no external dependency could be made 
responsible for
    the crash, such that it becomes clear that the cause lies in ooRexx itself.

    As the crash occurs rather early, if using "testTerminate.exe someArg" I 
had hoped that some
    of the C++ gurus would be able to locate (and hopfeully fix) the cause 
relatively quickly
    (realizing that multithreaded bugs can be extremely hard to trace down and 
to fix).
    Unfortunately, this has not happened to this day (after almost 16 months). 
As this is a
    showstopper and inhibits using this great ooRexx feature, namely being able 
to create any
    number of different Rexx interpreter instances (each having its own .local) 
in a process and
    terminate them.

    Therefore I kindly request any of the C++ experts to look into this matter 
if possible. Maybe
    the upcoming vacation time allows for a few time slots to use for 
analyzing, finding and
    hopefully fixing the cause of this showstopper crash!

    ---rony

    P.S.: Have simplified the test code a little bit, also allowing the test 
program to run with
    all three reported scenarios; it is enclosed as (also slightly stripped 
down)
    "testTerminate2.cpp" and "Makefile2.windows" ("nmake /f Makefil2.windows" 
will create
    testTerminate2.exe) which after successful compilation can be invoked as:

      * testTerminate2.exe

          o no arguments: will create and terminate all 250 Rexx interpreter 
instances (RII) on
            the same (main) thread without any crash

      * testTerminate2.exe oneArg

          o any single argument will cause the crash to always occur while 
terminating the
            *second* RII on a separate thread (as if terminating the previous 
RII on a separate
            thread left something in an unbalanced state)

      * testTerminate2.exe oneArg secondArg

          o any two arguments will cause fflush(stderr) in the terminating 
thread to be issued
            and as a result most of the time all 250 Rexx interpreter instances 
can be terminated
            on a separate thread without a crash; running it a couple of times 
may however cause
            that crash to occur at different RIIs

    Terminate() will kick off the garbage collector and the crash seems to 
occur during its run.

    P.P.S.: Would it be possible to have the termination of a RII carried out 
on the same thread
    as its creation (which seems to run stably) by ooRexx itself (e.g. having a 
specific thread
    to create and to terminate RIIs)?

    This may need a function to learn about whether a RII is currently 
running/blocked and only
    terminate it on the creation thread if not in use so not to block the 
creation thread. Maybe
    also (an asynchroneous?) a function to force termination of a RII would be 
helpful in such a
    scenario, maybe with an option that indicates whether running/blocking Rexx 
code on that RII
    should be forcefully halted or not.

_______________________________________________
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel

Reply via email to