Re: [Oorexx-devel] Some more information, and a random run (Re: Question ad a hang situation

dominicjwise Sun, 17 Aug 2025 16:31:23 -0700

Dear Rony,


I think there may actually be only one “cause” for the multiple lock acquires 
after all. I saw some additional errors when I tried to fix the one below 
(with some attempts at lock-balancing logic), which suggested a second issue, 
but it turns out I was initially using the wrong unlock function; switching 
to the one I submitted was all I needed to do.

 

My understanding is as follows (all methods of InterpreterInstance unless 
specified)

 

*       terminate() calls enterOnCurrentThread() to get a valid Activity for 
running any Uninit methods
*       enterOnCurrentThread() calls attachThread(*noargs*)
*       attachThread(*noargs*) checks for an existing usable activity with 
findActivity(). If it finds one, it returns it and there is no change to the 
kernel lock count at this point. If not, it creates a new one via 
ActivityManager::attachThread(), which acquires but does not release the kernel 
lock
*       On returning to enterOnCurrentThread(), Activity::requestApiAccess() is 
called, which either immediately acquires the kernel lock for the returned 
activity if it is available or puts the activity into a wait queue. This is on 
top of any earlier acquire in the bold code

 

     If the bold code is executed then the calling thread will end up holding 
the kernel lock after terminate() completes.
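
As a rough illustration of the chain described above, here is a toy C++ model 
of the lock count. It is only a sketch of my understanding, not the ooRexx 
code: ToyKernelLock and lockDepthAfterTerminate() are hypothetical names, the 
lock is reduced to a plain counter, and the wait-queue path of 
requestApiAccess() is left out.

```cpp
#include <cassert>

// Toy stand-in for the kernel lock: it only counts nested acquires.
struct ToyKernelLock {
    int depth = 0;
    void acquire() { ++depth; }
    void release() { assert(depth > 0); --depth; }
};

// Hypothetical model of the terminate() call chain.
// reuseExistingActivity == true  : attachThread() found a usable activity
// reuseExistingActivity == false : a new activity had to be created
// Returns the lock depth still held when the modelled terminate() exits.
int lockDepthAfterTerminate(bool reuseExistingActivity) {
    ToyKernelLock lock;

    if (!reuseExistingActivity) {
        // ActivityManager::attachThread(): acquires, does not release
        lock.acquire();
    }

    // Activity::requestApiAccess(): acquires for the returned activity
    lock.acquire();

    // ... Uninit methods would run here ...

    // the single release that terminate() performs
    lock.release();

    return lock.depth;
}
```

In this model lockDepthAfterTerminate(true) leaves a depth of 0, while 
lockDepthAfterTerminate(false) leaves a depth of 1, which is the extra acquire 
that the added release compensates for.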

 

I’m not sure exactly what additional checks need to be carried out, or where an 
extra release needs to go, to ensure that all functions in this call chain work 
correctly under all circumstances and always unlock only as many times as they 
locked, hence the “hack”.

 

I hope this helps to clarify the issue further, and of course if there are any 
glaring errors in my analysis I’ll only learn if someone (gently I hope!) 
points them out to me 😊

 

 

Kind Regards,

Dom

 

 

 

From: Rony G. Flatscher <rony.flatsc...@wu.ac.at> 
Sent: 17 August 2025 20:12
To: oorexx-devel@lists.sourceforge.net
Subject: Re: [Oorexx-devel] Some more information, and a random run (Re: 
Question ad a hang situation

 

Dear Dom,

On 17.08.2025 17:31, dominicjw...@gmail.com wrote:

I am one such masochist that Rony was referring to 😊

Very glad that you are one! ;)



It seems that in InterpreterInstance::terminate() there can be situations where 
the kernel lock ends up being taken twice. It is only ever released once and 
this means that when it has been locked twice the thread leaves the function 
owning the kernel lock and can go on happily locking and releasing it but any 
other thread trying to access it is immediately and forever locked.
 
I spent quite some time trying to figure out and code for the individual cases 
(I think there may be at least two) to ensure the acquires and releases are 
always balanced but in the end I gave up and as a hack added an extra call to 
ActivityManager::releaseAccess() at the end of this function. This seems to 
have done the trick and the program runs ok with this in place.
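
The symptom described above (two acquires, one release, other threads locked 
out) can be reproduced in miniature. The sketch below assumes the kernel lock 
behaves like a recursive lock for its owning thread, which is how it is 
described here; std::recursive_mutex is only a stand-in, and 
secondThreadGetsLock() is a hypothetical name, not part of ooRexx.

```cpp
#include <mutex>
#include <thread>

// Returns whether a second thread could take the lock after the first
// thread acquired it twice but released it only once.
bool secondThreadGetsLock() {
    std::recursive_mutex kernelLock;   // stand-in, not the ooRexx kernel lock

    kernelLock.lock();     // first acquire
    kernelLock.lock();     // second acquire on the same thread
    kernelLock.unlock();   // the only release: this thread still owns the lock

    bool acquired = false;
    std::thread other([&] {
        // try_lock() rather than lock(), so the demo reports the failure
        // instead of hanging the way the real program does
        if (kernelLock.try_lock()) {
            acquired = true;
            kernelLock.unlock();
        }
    });
    other.join();

    kernelLock.unlock();   // balance the count before the mutex is destroyed
    return acquired;
}
```

Here secondThreadGetsLock() returns false: the owning thread can keep locking 
and unlocking, but no other thread gets in until the missing release happens.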

This is *great*, thank you very much!

Could you briefly describe the two individual cases, if you have them still 
handy?

The reasons for the additional kernel locks may be easy to identify, or they may 
be a symptom of problems elsewhere that could be very subtle or nasty, but 
hopefully this provides a starting point.

Given that your hack removes the hang, and given that the test suite runs to 
completion on my Windows machine, I would like to apply that hack to trunk to 
see how it does on all Jenkins-hosted operating systems. 

Here is the patch I have been using and testing (yes, it fixes the hang, *very* 
appreciated!!):

Index: interpreter/runtime/InterpreterInstance.cpp
===================================================================
--- interpreter/runtime/InterpreterInstance.cpp (revision 13004)
+++ interpreter/runtime/InterpreterInstance.cpp (working copy)
@@ -299,6 +299,7 @@
     {
         terminationSem.post();
     }
+    ActivityManager::releaseAccess();   // hack kudos to Dom Wise (20250817 e-mail on developer list), remove kernel lock again
     return true;
 }

Of course, once a fix can be found, this patch should be replaced. 

First I will create an entry in the bug tracker and then commit that patch and 
record it against the bug.

Dom, again *many* thanks for your efforts and sharing your findings, *very* 
much appreciated!

Best regards

---rony 





-----Original Message-----
From: Rony G. Flatscher <rony.flatsc...@wu.ac.at> 
Sent: 17 August 2025 09:30
To: oorexx-devel@lists.sourceforge.net 
Subject: Re: [Oorexx-devel] Some more information, and a random run (Re: 
Question ad a hang situation
 
Hi Michael,
 
On 17.08.2025 03:47, Michael Lueck wrote:

W O W ! ! ! Bravo!!! A most impressive success milestone!

 
but not solved yet. It looks as if it is somehow linked to terminating interpreter 
instances right before the callback from Java into ooRexx. The change I made 
was a debug output statement from the Java garbage collector run (collecting 
the peer Java RexxEngines) right before the ooRexx instance gets terminated 
from the Java side, as if this little time span makes the difference (at least 
on my machine) and allows the Java callback into ooRexx to not block.
 
---rony

_______________________________________________
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel
