Re: [Oorexx-devel] Musings with tracing multithreaded ooRexx programs, mt91.rex: on two Rexx interpreter instances (RII)

Mike Cowlishaw Sun, 26 Feb 2023 08:49:18 -0800

OK, also needed for the weekend to spend time on this.
 
I think I agree that just adding the thread ID is not sufficient for every
case (although I think it would help greatly in many cases).
 
Trace is part of the language, however, and adding to it in any way needs
much care and attention: the formatting (widths, etc.) is as important as
any other detail of the language.  The trace format took years to evolve,
and is wasn't until lots of people were using it that it became clear how to
make it both simple and useful.
 
Like Chip (I suspect), I'm beginning to think that this is outside the
domain of the language and more suited to some external 'specialised tool'
such as an IDE (integrated development environment), or just an option to
write the trace to a file .. that is, not expected to be seen by humans but
more likely processed by a program to be presented in some useful way to the
human.
 
Mike

  _____  

From: Rony G. Flatscher [mailto:[email protected]] 
Sent: 22 February 2023 17:43
To: [email protected]
Subject: Re: [Oorexx-devel] Musings with tracing multithreaded ooRexx
programs, mt91.rex: on two Rexx interpreter instances (RII)

Mike, sorry for the long intermission, having been "under water" for
different reasons and I wanted to take my time to get back as becoming able
to fully understand and to debug multi threaded ooRexx programs is really a
very important and needed task.

On 17.02.2023 16:00, Mike Cowlishaw wrote:

Rony, I fully understand the need for thread information.   As I mentioned
in an earlier post I often add this to (manually-added) tracing in my C
programs -- which are true multi-core hardware threads.   I also suggested a
way that thread IDs could be added to the existing trace output (if
requested) with minimal modification of code and documentation.

Yes, and I picked up your suggestion and used it to demonstrate various
forms of trace output without and with it using a simple ooRexx
multithreaded test case.

If you look up the program "mt91.rex" and the trace without and with your
suggested thread id you will see, that supplying the thread id in that case
is not sufficient to understand how the multithreaded parts of the ooRexx
program get executed.

Also, if one repeats running that multithreaded ooRexx program the sequence
of the executed ooRexx statements may be different and differently
interleaved. 

What I don't understand is the need to add lots of other options, wide
spacing, and fixed-format layout which simply won't work when the numbers
are more than one or two digits.  And it seems to me that only the author of
this design might guess that "1*" refers to thread locks ... :-).

Probably I did a lousy job explaining the problems that may occur when
multithreaded ooRexx programs get executed:

*       repeating execution of multithreaded ooRexx programs do not
necessarily repeat the same sequence of traces 

*       the execution of a section of ooRexx code may be interleaved with
the execution of another section of the ooRexx code differently from run to
run 

*       this poses the principal problem - comparing with classic Rexx
programs - of understanding what happens and why 

*       having the thread id available in the trace output may help
understand the execution in relative simple cases of multithreading like in
the mt01.rex example, but not so for more complex ones like mt91.rex,
unfortunately 

*       there is trace information missing about the execution context that
is needed to become able to understand what really happens where

*       if you look up the trace according to your suggestion to add the
thread information using the thread id "1" of mt91.rex you will not be able
to understand what happens there; the reason being that in thread id "1"
there are different activations (invocations) taking place, and worse these
activations (invocations) belong to two totally different program runs, to
two different Rexx instances, information that is missing from the trace

*       if however you look up the trace of mt91.rex with the suggested
extended trace information you can all of a sudden distinguish the
activations (invocations) that occur on e.g. T1 and can see that they are
different from each other; as the same program gets run concurrently on two
different Rexx instances you get to see which Rexx instance executes which
activation (invocation) on which thread and therefore you can distinguish
them from each other even on thread T1 and become able to understand which
executions belong together and which ones don't; having only the thread
number does not allow you to do that at all in this case!

So the examples so far have the purpose to demonstrate multi threaed ooRexx
programs and the traces that get caused by them, without concurrent trace
information, with the thread id as per your suggestion and the suggested
extended (concurrency) trace output. 

mt91.rex should have made it clear that supplying only the thread id as part
of the trace output is unfortunately not sufficient to become able to
understand what is going on where and why. This example is intended to allow
understanding why additional execution related information that is
concurrency  dependent becomes necessary as otherwise one cannot (easily or
not at all) understand what is going on in your own ooRexx program!

---

Ad variable pools, lock counts with the activations (invocations) possessing
the lock: the sample programs I supplied have not (yet) demonstrated the
usefulness of these three pieces of information. They become helpful the
moment you hit e.g. deadlocks and have a need to find out which activation
(invocation) gets locked out because of which other activations
(invocations) where locks get added and where removed. 

At this moment, I am afraid, it probably does not really make sense to come
up with a more complex example that demonstrates that, as long as it is not
understood in the first place why it is really necessary to get more than
the thread id shown in the trace output in order to become able to
understand and to debug multithreaded ooRexx programs as mt91.rex was
intended to demonstrate.

---

In the end the point is: ooRexx - unlike classic Rexx - is by design a very
flexible and multithreaded programming language. The trace keyword
instruction in ooRexx has not been updated to support understanding and
debugging such mulithreaded ooRexx programs. 

Adding the thread id by default like you suggest is a first step that allows
for understanding and debugging simple multithreaed ooRexx programs. Also,
it allows understanding and debugging classic Rexx programs that get
dispatched on different threads by a hosting program that runs Rexx programs
on different of its threads. 

However, adding the thread id is not sufficient to understand and to debug
powerful/complex multithreaded ooRexx programs. ooRexx programs can be
executed in a multithreaed manner by different means: here some of the
ooRexx-related means that play a role in a multithreaded ooRexx world:

*       the REPLY keyword statement (returns and activation/invocation gets
carried on on a different thread) 

*       the START message (root class .Object or class .Message) 

*       methods marked as GUARDED or UNGUARDED (lock related)

*       the GUARD keyword statement (lock related) 

*       Rexx instances executing ooRexx programs in a multithreaded context
(dispatched by host applications)

---

Of course any other form, idea to supply the necessary information to help
understand and debug multithreaded ooRexx programs/applications is welcome
and appreciated! However, so far only supplying the thread id in the trace
message was suggested which is not enough information, unfortunately.

---

Ad format: of course one can use less widths but at this point in time is
maybe not as important as understanding/willing to supply the necessary
additional information that enables, empowers one to fully understand and
debug multi-threaded ooRexx programs. 

Please do not misunderstand me: I also think that defining the format is
very important to help/ease programmers understand the output. So a
discussion about the format, sequence, abbreviations is welcome of course.
The more improvements the better.

---

Context information like R (Rexx instance), T (thread number), A
(activation/invocation), V (variable pool), lock-counts with asterisk (*)
indicating who owns the lock, are really important for becoming able to
understand and to debug multithreaded ooRexx programs. It is o.k. to
challenge each such information.

And again: if there are ideas about alternative ways enabling ooRexx
programmers to quickly understand and to debug multithreaded ooRexx
programs, they are welcome. However, so far no such suggestion, idea has
been communicated. And as the need for such information is a reality (in my
case urgently vis-à-vis my students who start as beginners and at the end of
the semester employ multithreaded ooRexx programs with the need to
understand and to debug them as easy as possible) I would very much make it
available ASAP. However, I am interested in a constructive procedure, if
possible.

---rony

  _____  

From: Rony [mailto:[email protected]] 
Sent: 16 February 2023 21:43
To: Open Object Rexx Developer Mailing List
Subject: Re: [Oorexx-devel] Musings with tracing multithreaded ooRexx
programs, mt91.rex: on two Rexx interpreter instances (RII)

Am 15.02.2023 um 18:57 schrieb Mike Cowlishaw  <mailto:[email protected]>
<[email protected]>:

As for the 'spaced out' case (excerpt below) ... this really would not work
for me.   I often have 5-9 windows open when I'm programming and these are
80 characters wide so I can minimise overlaps.  With the suggested layout
this would only work for programs less than ~40 characters wide!   Here's
how the excerpt looks for me (and this example has very short lines -- most
of my programs use 72 or more characters per line for better commentary):

---> mt91.rex_nr_1_via_JSR223
R1   T1   A1                    3 *-* t=.Test~new
R1   T1   A2    V1      1*     21 *-* say "arrived in:" .context~name
arrived in: INIT
R1   T1   A2    V1      1*     22 *-* counter=0
R1   T1   A1                      >>>   "a TEST"
R1   T1   A1                    4 *-* t~m1
R1   T1   A3    V1      1*     27 *-* counter+=1          -- increase
counter
R1   T1   A3    V1      1*     28 *-* say "arrived in:" .context~name
"before reply"

Almost any line of any length will wrap.  That's why the trace headers in
Rexx are kept as short as feasible.  

Yes trace has been well thought out and well designed.

It seems that you are under the impression that this extra trace information
should get added to trace by default? If so, that is not the case. In effect
as designed and communicated (maybe badly), you need to activate this extra
trace information explicitly which you would do only, if you have an
important reason: needing additional information in order to become able to
debug multithreaded programs. You would so only, if it helps, if it benefits
the programmer in understanding and debugging. Otherwise the option would
not be used.

You would probably not be a candidate needing this extra information
(assuming that you are not employing ooRexx multithreading) so you would
not activate it and everything would remain the same for your specific
configured working environment. Nothing changes for you.

Those who are in need will be more than happy to get these extra information
to save a lot of time and becoming able to analyze and to understand
multithreaded problems. Without this extra, multithreaded related trace
information it would not be possible the more complex, the more interleaved
multithreaded execution takes place. 

Just look at the trace, trace with thread number and with
extended/mulithreading-related trace e.g. with the mt91.rex example in my
earlier post: without the multithreaded/extended trace no one could
understand what really happens there. This is the core of the problem: trace
has not been extended to supply this necessary multithreaded/extended
information in case multithreading causes problems and the programmer
requests the multithreaded-related/extra information with trace.

Adding an unexplained 27 characters on the front of each line makes little
sense, 

Unexplained, hence surprising?

Seriously, this extended trace information would get created only if the
ooRexx programmer/user requests it explicitly. To be able to request the
multithreaded trace one needs to be aware a) it exists and b) the option to
get it activated. 

This is only possible if the programmer/user reads the documentation which
will include the explanation of what the extra trace information is about:
R is the Rexx instance, T is the thread number, A is the activation
(invocation), V is the variable pool, the number column shows the number
of requested locks, the asterisk the owner of the lock (that may cause
deadlocks for others). The mnemonics make it easy to remember. Everyone who
read that or got the explanation for the letters will be able understand
what is being presented. 

So, they get explained and everyone who read the documentation would
understand. Also, it is easy to explain (hence also easy to understand), but
only if this information is made available.

especially as the information is the same on most lines, 

The information repeats in some columns in these traces. The purpose of this
is to become able to spot and analyze immediately lines that deviate, e.g.
the Rexx instance number changes, the thread number changes, the activation
number changes etc. which is the case when concurrently other parts get
executed, interleaving with the trace lines that otherwise have a few
columns in common. Without the multithreaded/extra information one is not
able to notice that at all! 

This is exactly the problem in multithreaded programs, the more concurrency
the more important to become able to notice that in order to become able to
analyze and for that reason you need to carry these information!

and as I mentioned before is not user-friendly (here I mean 'user' as being
a writer of Rexx programs, not someone who runs a Rexx program without
looking at it or caring which language it is written in). 

This is where we differ: leaving that away to make it user-friendly by
your definition (keep what you see as redundancy out, saving space to keep
the lines within 80 columns etc.) makes it impossible for the user to find
out how his program gets executed in a multithreaded environment
(application hosted and/or self induced) and in case of a problem to become
able to find the cause. The user is rendered helpless in such a situation.

Of course you are right that this extended information is not necessary for
non-concurrent Rexx programs, which classic Rexx programs are by nature (not
having the means in the language to kick off multithreading). Therefore it
goes without saying that the multithreaded/extended trace needs to be off by
default. A user will activate multithreading trace only if he gets
additional helpful information for his program and problem.

As I  teach every semester newcomers to ooRexx and can observe where and why
students get into problems and loose (sometimes literally weeks) because of
problems in multithreaded environments because of lack of related trace
information supplied by ooRexx. This is typically in GUI environments
(awt/swing, JavaFX), something that is very important to master in todays
world. They would become able to see and analyze how their Rexx programs
execute in such an environment. Currently, they can only try to code SAY
debug statements scattered over their code and still will not have
information like which activation owns at a certain point in the execution
thread the lock and which ones are blocked etc. (The available multithreaded
trace supplies exactly these insights which would help them immediately and
a lot.)

To sum up: the multithreaded trace is not meant to be active by default. A
programmer/user needs to activate it and if he is doing so, there must be a
good reason as the produced extra debug information may be overwhelming (but
extremely helpful and time-saving). 

Any Rexx programmer/user that does not employ multithreading will never
activate that option and everything remains untouched and the way it has
been for the last 40 years.

-

If you have alternative ideas about how one could come up with the same
information for multithreaded ooRexx programs in a manner, that you see more
user friendly and carries comparable explanation and problem solving power
then I really would be interested in it. The main use case is a deadlock due
to multithreading issues in some ooRexx program(s).

-rony

_______________________________________________
Oorexx-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oorexx-devel

Re: [Oorexx-devel] Musings with tracing multithreaded ooRexx programs, mt91.rex: on two Rexx interpreter instances (RII)

Reply via email to