Hi,

 

I’m revisiting this because although my patch fixed the problem and did not 
show up any new ones, strictly speaking there is one more thing that needs to 
be done for correctness and future proofing

 

The patch as it stands carries out an extra lock release in 
InterpreterInstance::terminate whether it needs to or not just to deal with the 
case where an extra one is actually needed because a new activity was created. 
For cases where it is not needed this extra unlock theoretically  leads to the 
lock count becoming negative. The locks we currently use are forgiving of and 
ignore this but it is still wrong, and  were we ever to switch to using 
different  locks (e.g. C++ recursive_mutexes – which I *might* have tested 
locally) or should future operating system releases be less forgiving this 
imbalance could causes deadlocks or other errors when interpreter instances 
shutdown “normally” (i.e. without the extra activity creation)

 

The attached patch adds an “activity created”  flag in a few places so that 
InterpreterInstance::terminate can determine when a new activity was actually 
created and only carry out the extra unlock when it was.

 

I’ve added this to the ticket as well for completeness.

 

Kind Regards,

Dom

 

 

 

 

From: Rony G. Flatscher <[email protected]> 
Sent: 25 August 2025 14:31
To: [email protected]
Subject: Re: [Oorexx-devel] No, only the hang bug left and that can be solved 
with Dom Wise's patch (Re: Maybe two different bugs? A question ad fixing the 
hang

 

Judging from the results on Jenkins this patch did not cause any failures, 
hence marking the reported bug (https://sourceforge.net/p/oorexx/bugs/2030/) as 
being fixed.

---rony

 

On 24.08.2025 11:02, Rony G. Flatscher wrote:

On 23.08.2025 23:30, [email protected] <mailto:[email protected]>  
wrote:

Attached is the diff for the full fix (from clean svn 13006 ) which only nudges 
the dispatch queue once. 

+1

Thank you very much, Dom, kudos!

Tested it against the ooRexx test suite on my Windows machine which passes. 

Also tested it against the JavaFX ooRexx application 
(BSF4ooRexx850\samples\JavaFX\MainApp.rex) under Java 8LTS, 17LTS and 24, which 
now thankfully works again, which is really *great*!

Will commit your patch and we will see via the Jenkins managed operating 
systems how the ooRexx test suite fares on all of them.

Cheers

---rony

  

 

From:  <mailto:[email protected]> [email protected]  
<mailto:[email protected]> <[email protected]> 
Sent: 23 August 2025 21:49
To: 'Open Object Rexx Developer Mailing List'  
<mailto:[email protected]> <[email protected]>
Subject: RE: [Oorexx-devel] No, only the hang bug left and that can be solved 
with Dom Wise's patch (Re: Maybe two different bugs? A question ad fixing the 
hang

 

Also changed in Activity.hpp (missed this earlier)

 

-+ void exitCurrentThread(bool dispatch = true);

    

I’ll put together a proper diff

 

From:  <mailto:[email protected]> [email protected] < 
<mailto:[email protected]> [email protected]> 
Sent: 23 August 2025 20:24
To: 'Open Object Rexx Developer Mailing List' < 
<mailto:[email protected]> [email protected]>
Subject: RE: [Oorexx-devel] No, only the hang bug left and that can be solved 
with Dom Wise's patch (Re: Maybe two different bugs? A question ad fixing the 
hang

 

Hi Rony,

 

Nice work! 

 

As regards to my possible fix I was thinking that it might be a good idea to 
modify it slightly with a few additional small changes.

 

What I’m thinking is that ActivityManager::releaseAccess should have a flag 
added which determines whether or not to call dispatchNext(). This is because 
in  InterpreterInstance::terminate, calling  exitCurrentThread() will 
ultimately call this this and so will the additional “fallback” 
ActivityManager::releaseAccess  call made later on. It looks like 
ActivityManager::dispatchNext removes the head of the activity queue and tries 
to set it running. If this is called twice I’m not sure whether this would 
cause problems.

 

What might be a good idea is for Activity::exitCurrentThread to have an 
optional bool “dispatch” parameter, set to false in this one instance, along 
with Activity::releaseAccess which it calls, and have that pass through the 
parameter to ActivityManager::releaseAccess, modified to only only call 
dispatchNext() when the flag is true. This exitCurrentThread would then release 
the kernel lock but not immediately try to dispatch another thread. The newly 
added call to ActivityManager::releaseAccess (the “fix”) would have the usual 
call to release (if still held) the kernel lock AND call dispatchNext, ensuring 
it always gets called exactly once.

 

Below is a summary of the changes. Lines with -+ are changed lines. If this 
seems like a good idea let me know and I’ll get a fresh SVN tree to make the 
changes in so that I can provide a proper diff file. Likewise if there is any 
critique or suggestion for modification I’m all ears!

 

In Activity.hpp

-+    void        releaseAccess(bool dispatch = true);

 

In Activity.cpp

 

void Activity::exitCurrentThread(bool dispatch)

{

….

-+    releaseAccess(dispatch);

}

 

void Activity::releaseAccess(bool dispatch)

{

…

-+        ActivityManager::releaseAccess(dispatch);

    }

}

 

In ActivityManager.hpp

 

-+    static void releaseAccess(bool dispatch = true);

 

In ActivityManager.cpp

 

void ActivityManager::releaseAccess(bool dispatch)

{

…

++    if (dispatch)

++    {

      dispatchNext();

++    }

}

 

In InterpreterInstance.cpp

 

bool InterpreterInstance::terminate()

{

…

 

-+        current->exitCurrentThread(false);

…

-+        current->exitCurrentThread(false);

…

 

++    ActivityManager::releaseAccess(); // the ‘fix’

 

Kind Regards,

Dom

 

 

From: Rony G. Flatscher < <mailto:[email protected]> 
[email protected]> 
Sent: 23 August 2025 17:13
To:  <mailto:[email protected]> 
[email protected]
Subject: [Oorexx-devel] No, only the hang bug left and that can be solved with 
Dom Wise's patch (Re: Maybe two different bugs? A question ad fixing the hang

 

On 22.08.2025 21:58, Rony G. Flatscher wrote:

On 22.08.2025 13:31, Rony G. Flatscher wrote:

Having tested the interpreter without Dom's patch with the samples fxml_25, 
fxml_26, fxml_27, and fxml_99 ("BSF4ooRexx850\samples\JavaFX") the following 
could be observed using different versions of Java (Java 8 and in the end Java 
24):

1.      the hang of fxml_99 occurs on all Java versions
2.      fxml_25, fxml_26, and fxml_27 seem to work on Java 8

1.      fxml_26 and fxml_27 crash on Java 24

This leads me to believe that there might be two different problems here. 
Fixing the hang somehow causes the crashes of fxml_26 and fxml_27 (and then 
sometimes of fxml_25) on both, Java 8 and Java 24, so they seem to be uncovered 
earlier. 

... cut ...

One important note: here "crashes" should be rephrased to "Java NullPointer 
exceptions", these are *not* crashes of the process! Will runtime debug 
BSF.CLS, JNI and the Java side of the bridge once more, but this may take some 
time as glimpses of private life take precedence this weekend ...  ;)

After going through the BSF.CLS, BSF4ooRexx.cc/JNI, Java bridge programs 
(creating MBs of debug data) I found out that the version of BSF4ooRexx850 I 
have been working got tampered by myself! :-( After inspecting my changes 
(basically debug statements) I found the culprit (it was an error in the Java 
reflection part of Java constructors).

Tested fxml_20, fxml_25, fxml_26, fxml_27, fxml_99 on Java 8, Java 17 and Java 
24. They work thankfully! :)

---

The hang bug (fxml_99) can be fixed with Dom Wise's patch. 

So far no feedback to the contrary has been given and the test to apply it a 
hundred times when terminating an interpreter instance without any side effects 
(which would be surprising seeing the code that gets executed), such that I 
would like to apply his patch (unlocking the kernel by invoking 
"ActivityManager::releaseAccess()" right before invoking 
"Intepreter::terminateInterpreterInstance(this)) to trunk:

Index: interpreter/runtime/InterpreterInstance.cpp
===================================================================
--- interpreter/runtime/InterpreterInstance.cpp (revision 13006)
+++ interpreter/runtime/InterpreterInstance.cpp (working copy)
@@ -559,6 +559,7 @@
     commandHandlers = OREF_NULL;
     requiresFiles = OREF_NULL;
 
+ActivityManager::releaseAccess();
 
     // tell the main interpreter controller we're gone.
     Interpreter::terminateInterpreterInstance(this);

Any objections?

With fixing the hang bug there are no open show-stopper bugs in ooRexx anymore!

---rony

 

Attachment: rebalance-terminate-kernel-locks.diff
Description: Binary data

_______________________________________________
Oorexx-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oorexx-devel

Reply via email to