Yes, impressive, indeed, thank you, René.
However, one important piece of information is missing: that application did work a couple of
years ago, and it still works sometimes; when it does, it is mostly on Linux and macOS. Therefore I
think that in principle everything is set up correctly, but that some situation arises that causes
the hang. Having spent quite some time with that area of the interpreter, I was hoping to get some
hints, ideas, or theories about what a possible reason for it could be. Granted, this is an
optimistic request, but hey, if one does not try, one will not get a "lucky punch" hint. If there
are no ideas, then I will need to go through the code systematically, which may take a lot of time
and effort.
---rony
On 13.08.2025 16:08, Gilbert Barmwater via Oorexx-devel wrote:
WOW! Unbelievable that AI could do that, at least to me. If most of that is, in fact, meaningful
- and I have no way of knowing if it is or isn't, way over my head - this is a significant
addition to the ability to debug complex code problems. I have my fingers crossed that this will
help Rony find his problem because I want to believe in this approach. Thanks for sharing René!
Gil
On 8/13/2025 9:53 AM, René Jansen via Oorexx-devel wrote:
I asked my buddy AI for you:
Short version: almost everything here is *blocked, waiting on kernel objects/events*. One thread
(the one with |rexx.dll| in the stack) is trying to *attach to ooRexx* via BSF4ooRexx while the
JVM is already involved, and it’s waiting for the *ooRexx kernel mutex*. Meanwhile several JVM
worker threads are also parked in waits. This pattern screams *lock-order inversion / deadlock
between Java ↔ ooRexx* (likely “call into Rexx while holding something, which calls back into
Java, which tries to attach back into Rexx and blocks on the Rexx global lock”).
What the stacks say
* Repeated tops of stack:
  |ntdll!NtWaitForSingleObject → KernelBase!WaitForSingleObjectEx → jvm.dll!...|
  That’s a *parked/waiting thread* (monitor/condition/OS event); not runnable.
* The interesting one (Not Flagged, tid |> 23728|):
  |win32u!NtUserMsgWaitForMultipleObjectsEx → user32!RealMsgWait… → rexx.dll!waitHandle →
  SysMutex::request → ActivityManager::lockKernel → Activity::waitForKernel →
  ActivityManager::addWaitingActivity → Activity::requestAccess → Activity::nestAttach →
  InterpreterInstance::attachThread → AttachThread → BSF4ooRexx850.dll …|
  This shows a *BSF/ooRexx attach* trying to acquire the *Rexx kernel lock* and *waiting*
  (message-wait variant, so it can pump messages).
* Many other JVM threads show the same wait pattern at different internal pcs
  (|jvm.dll!0x7117e75a|, |…e82f|, etc.). That’s consistent with *Java threads parked on
  monitors/conditions* (e.g., GC, JIT, RMI, pool workers) while some other thread is expected
  to make progress but isn’t.
Likely scenario
1. A thread entered *ooRexx* and still *holds the Rexx kernel mutex* (ooRexx is
   single-kernel-locked).
2. During that work, it *called into Java* (BSF).
3. Another thread (or a callback on the same thread through message pumping) is now trying to
   *AttachThread / nestAttach* back into ooRexx → it blocks on |SysMutex::request()| because the
   kernel lock is still held.
4. Because of that, Java side work that would release the original path can’t continue →
   *deadlock* (or at least a hard stall).
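The four steps above can be reproduced in miniature with plain Java. This is only a sketch of the suspected pattern, not ooRexx or BSF4ooRexx code: |kernelLock| is a hypothetical stand-in for the Rexx kernel mutex, the single-thread executor plays the role of the Java side, and a timed |tryLock| replaces the real blocking request so that the demo reports the stall instead of hanging.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;

public class KernelLockStallSketch {
    // Hypothetical stand-in for the single ooRexx kernel lock.
    private static final ReentrantLock kernelLock = new ReentrantLock();

    /**
     * Returns true if the re-entering thread obtained the lock within timeoutMs
     * while the original owner was still holding it and waiting for the result.
     */
    public static boolean reentryWhileKernelHeld(long timeoutMs) throws Exception {
        ExecutorService javaSide = Executors.newSingleThreadExecutor();
        kernelLock.lock();                              // step 1: owner is "inside Rexx"
        try {
            // step 2: the owner calls out to the "Java side" ...
            Future<Boolean> callback = javaSide.submit(() -> {
                // step 3: ... which tries to attach back and needs the same lock
                if (kernelLock.tryLock(timeoutMs, TimeUnit.MILLISECONDS)) {
                    kernelLock.unlock();
                    return true;
                }
                return false;  // an untimed request would block here indefinitely
            });
            // step 4: the owner waits for the callback while still holding the lock
            return callback.get();
        } finally {
            kernelLock.unlock();
            javaSide.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("re-entry acquired kernel lock: " + reentryWhileKernelHeld(200));
    }
}
```

With an untimed lock request in place of |tryLock|, neither thread could ever continue, which is the shape the attach stack above suggests.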
How to confirm quickly
* *Grab a full dump* and in WinDbg run:
  o |!locks| (who owns the critical section / mutex in |rexx.dll|).
  o |~* kb| to see if any thread is inside |rexx.dll| /not/ waiting (the probable owner).
* From the Java side, run |jcmd <pid> Thread.print| (or |jstack|) and look for:
  o Threads in |parking to wait for <…>| with *owning thread* info.
  o Any thread doing a *native call into BSF/ooRexx*.
* In Process Explorer’s Threads tab, the *Waiting reason* for the Rexx-attaching thread will be
  a wait on an event/mutex handle; the *owner* (if resolvable) points to the holder.
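If attaching |jstack| to the hung process is awkward, roughly the same Java-side information can be collected from inside the JVM with the standard |java.lang.management| API. The sketch below is an assumption about how one might wire this in (the class and method names are made up for illustration); it lists threads that are parked or executing native code. Note that |findDeadlockedThreads()| only knows about Java monitors and |java.util.concurrent| synchronizers, so a wait on a native mutex such as the one behind |SysMutex::request| will not be reported as a deadlock there.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.ArrayList;
import java.util.List;

public class WaitingThreadReport {
    /** One line per live thread that is blocked, waiting, or running native code. */
    public static List<String> suspects() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        ThreadInfo[] infos = mx.dumpAllThreads(
                mx.isObjectMonitorUsageSupported(), mx.isSynchronizerUsageSupported());
        List<String> lines = new ArrayList<>();
        for (ThreadInfo ti : infos) {
            if (ti == null) {
                continue;
            }
            Thread.State state = ti.getThreadState();
            boolean parked = state == Thread.State.BLOCKED
                    || state == Thread.State.WAITING
                    || state == Thread.State.TIMED_WAITING;
            if (!parked && !ti.isInNative()) {
                continue;
            }
            StackTraceElement[] frames = ti.getStackTrace();
            String top = frames.length > 0 ? frames[0].toString() : "(no Java frames)";
            lines.add(String.format("\"%s\" state=%s inNative=%b waitingOn=%s owner=%s at %s",
                    ti.getThreadName(), state, ti.isInNative(),
                    ti.getLockName(), ti.getLockOwnerName(), top));
        }
        return lines;
    }

    /** Thread ids in a Java-level deadlock; empty if the JVM sees none. */
    public static long[] javaLevelDeadlocks() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        long[] ids = mx.isSynchronizerUsageSupported()
                ? mx.findDeadlockedThreads()
                : mx.findMonitorDeadlockedThreads();
        return ids == null ? new long[0] : ids;
    }

    public static void main(String[] args) {
        suspects().forEach(System.out::println);
        System.out.println("Java-level deadlocked threads: " + javaLevelDeadlocks().length);
    }
}
```

A thread shown with |inNative=true| and a BSF-related top frame would be the candidate to match against the native stacks.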
Practical fixes / mitigations
* *Never call back into Rexx while holding the Rexx kernel lock.* In native ooRexx extensions
  this usually means:
  o Minimize the critical section; *release the kernel* before making Java/BSF calls that can
    reenter.
  o If you must call out, *queue* work (post to another thread) instead of re-entering
    synchronously.
* For *BSF4ooRexx*:
  o Ensure every |AttachThread| is matched with |DetachThread| in a |try/finally|.
  o Avoid *nested attaches* (|nestAttach| shows on your stack). If you’re already attached,
    reuse the context; don’t attach again.
  o If Java code calls into Rexx, and Rexx code calls back into Java, enforce a single
    direction at a time (e.g., schedule the callback asynchronously).
* Add diagnostics:
  o On the Rexx side, enable verbose/trace around *kernel lock* operations if available; log
    when you enter/leave Rexx from Java and vice versa.
  o On the Java side, log before/after *BSF calls* and include current thread name + whether
    already attached.
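The attach/detach pairing, the "reuse instead of nesting" rule, and the logging suggestion can be combined in one small guard. The following is a hedged sketch only: |AttachGuardSketch|, |attach()| and the two counters are hypothetical names, and the counters merely mark where the real native attach and detach calls would go; nothing here is the BSF4ooRexx API.

```java
import java.util.concurrent.atomic.AtomicInteger;

public class AttachGuardSketch {
    // Hypothetical counters marking where the real native attach/detach would happen.
    public static final AtomicInteger nativeAttaches = new AtomicInteger();
    public static final AtomicInteger nativeDetaches = new AtomicInteger();

    // Per-thread nesting depth, so an already attached thread does not attach again.
    private static final ThreadLocal<int[]> depth = ThreadLocal.withInitial(() -> new int[1]);

    public static final class Attachment implements AutoCloseable {
        private boolean closed;

        private Attachment() {
            if (depth.get()[0]++ == 0) {
                nativeAttaches.incrementAndGet();   // only the outermost level attaches
                log("attach");
            }
        }

        @Override
        public void close() {
            if (closed) {
                return;                             // tolerate a double close
            }
            closed = true;
            if (--depth.get()[0] == 0) {
                nativeDetaches.incrementAndGet();   // only the outermost level detaches
                log("detach");
            }
        }
    }

    public static Attachment attach() {
        return new Attachment();
    }

    public static int currentDepth() {
        return depth.get()[0];
    }

    private static void log(String what) {
        System.out.println(Thread.currentThread().getName() + ": " + what);
    }

    public static void main(String[] args) {
        try (Attachment outer = attach()) {
            try (Attachment inner = attach()) {     // nested use reuses the outer attachment
                System.out.println("depth inside nested block = " + currentDepth());
            }
        }
        System.out.println("native attaches = " + nativeAttaches.get()
                + ", detaches = " + nativeDetaches.get());
    }
}
```

Using try-with-resources gives the same guarantee as an explicit |try/finally|, and the log lines carry the thread name so that attach and detach events can be correlated with a thread dump.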
If it’s not a full deadlock (just a stall)
It can still be *head-of-line blocking*: one long-running Rexx activity holds the kernel, and
many threads pile up on |requestAccess()|. The cure is the same—*shorten the locked region* or
make the long task cooperative (yield/release).
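"Shorten the locked region" typically means: copy what you need under the lock, release it, do the slow call-out, and take the lock again only to publish the result. A minimal Java sketch of that shape follows, again with a hypothetical |kernelLock| standing in for the Rexx kernel mutex and made-up method names; whether the interpreter's own data structures allow such a split is a separate question this sketch does not answer.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.UnaryOperator;

public class ShortLockedRegionSketch {
    // Hypothetical stand-in for the kernel lock and the state it protects.
    private static final ReentrantLock kernelLock = new ReentrantLock();
    private static String sharedState = "initial";

    /** Runs slowCallout without holding the lock; locks only to read and to publish. */
    public static String updateWithShortLock(UnaryOperator<String> slowCallout) {
        String snapshot;
        kernelLock.lock();
        try {
            snapshot = sharedState;                 // short critical section: read
        } finally {
            kernelLock.unlock();
        }
        String result = slowCallout.apply(snapshot); // call-out runs with the lock released
        kernelLock.lock();
        try {
            sharedState = result;                   // short critical section: publish
            return result;
        } finally {
            kernelLock.unlock();
        }
    }

    /** Asks a different thread whether it can get the lock within 100 ms. */
    public static boolean lockIsFreeForOthers() {
        ExecutorService other = Executors.newSingleThreadExecutor();
        try {
            return other.submit(() -> {
                if (kernelLock.tryLock(100, TimeUnit.MILLISECONDS)) {
                    kernelLock.unlock();
                    return true;
                }
                return false;
            }).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            other.shutdown();
        }
    }

    public static void main(String[] args) {
        boolean[] freeDuringCallout = new boolean[1];
        String result = updateWithShortLock(s -> {
            freeDuringCallout[0] = lockIsFreeForOthers(); // probe while "in the call-out"
            return s + "-updated";
        });
        System.out.println(result + ", lock free during call-out: " + freeDuringCallout[0]);
    }
}
```

The price of this shape is that the state may change between the read and the publish, so the publishing step has to tolerate or detect a stale snapshot.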
If you paste the owner of the Rexx mutex from |!locks| (or a |jstack| snippet showing the thread
doing the call into Rexx while others block), I can point at the exact offender and the safest
place to release the lock.
best regards,
René.
_______________________________________________
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel
--
Gil Barmwater
--
__________________________________________________________________________________
Prof. Dr. Rony G. Flatscher, iR
Department Wirtschaftsinformatik und Operations Management
WU Wien
Welthandelsplatz 1
A-1020 Wien/Vienna, Austria/Europe
http://www.wu.ac.at
__________________________________________________________________________________