OK, also needed for the weekend to spend time on this. I think I agree that just adding the thread ID is not sufficient for every case (although I think it would help greatly in many cases). Trace is part of the language, however, and adding to it in any way needs much care and attention: the formatting (widths, etc.) is as important as any other detail of the language. The trace format took years to evolve, and is wasn't until lots of people were using it that it became clear how to make it both simple and useful. Like Chip (I suspect), I'm beginning to think that this is outside the domain of the language and more suited to some external 'specialised tool' such as an IDE (integrated development environment), or just an option to write the trace to a file .. that is, not expected to be seen by humans but more likely processed by a program to be presented in some useful way to the human. Mike
_____ From: Rony G. Flatscher [mailto:rony.flatsc...@wu.ac.at] Sent: 22 February 2023 17:43 To: oorexx-devel@lists.sourceforge.net Subject: Re: [Oorexx-devel] Musings with tracing multithreaded ooRexx programs, mt91.rex: on two Rexx interpreter instances (RII) Mike, sorry for the long intermission, having been "under water" for different reasons and I wanted to take my time to get back as becoming able to fully understand and to debug multi threaded ooRexx programs is really a very important and needed task. On 17.02.2023 16:00, Mike Cowlishaw wrote: Rony, I fully understand the need for thread information. As I mentioned in an earlier post I often add this to (manually-added) tracing in my C programs -- which are true multi-core hardware threads. I also suggested a way that thread IDs could be added to the existing trace output (if requested) with minimal modification of code and documentation. Yes, and I picked up your suggestion and used it to demonstrate various forms of trace output without and with it using a simple ooRexx multithreaded test case. If you look up the program "mt91.rex" and the trace without and with your suggested thread id you will see, that supplying the thread id in that case is not sufficient to understand how the multithreaded parts of the ooRexx program get executed. Also, if one repeats running that multithreaded ooRexx program the sequence of the executed ooRexx statements may be different and differently interleaved. What I don't understand is the need to add lots of other options, wide spacing, and fixed-format layout which simply won't work when the numbers are more than one or two digits. And it seems to me that only the author of this design might guess that "1*" refers to thread locks ... :-). Probably I did a lousy job explaining the problems that may occur when multithreaded ooRexx programs get executed: * repeating execution of multithreaded ooRexx programs do not necessarily repeat the same sequence of traces * the execution of a section of ooRexx code may be interleaved with the execution of another section of the ooRexx code differently from run to run * this poses the principal problem - comparing with classic Rexx programs - of understanding what happens and why * having the thread id available in the trace output may help understand the execution in relative simple cases of multithreading like in the mt01.rex example, but not so for more complex ones like mt91.rex, unfortunately * there is trace information missing about the execution context that is needed to become able to understand what really happens where * if you look up the trace according to your suggestion to add the thread information using the thread id "1" of mt91.rex you will not be able to understand what happens there; the reason being that in thread id "1" there are different activations (invocations) taking place, and worse these activations (invocations) belong to two totally different program runs, to two different Rexx instances, information that is missing from the trace * if however you look up the trace of mt91.rex with the suggested extended trace information you can all of a sudden distinguish the activations (invocations) that occur on e.g. T1 and can see that they are different from each other; as the same program gets run concurrently on two different Rexx instances you get to see which Rexx instance executes which activation (invocation) on which thread and therefore you can distinguish them from each other even on thread T1 and become able to understand which executions belong together and which ones don't; having only the thread number does not allow you to do that at all in this case! So the examples so far have the purpose to demonstrate multi threaed ooRexx programs and the traces that get caused by them, without concurrent trace information, with the thread id as per your suggestion and the suggested extended (concurrency) trace output. mt91.rex should have made it clear that supplying only the thread id as part of the trace output is unfortunately not sufficient to become able to understand what is going on where and why. This example is intended to allow understanding why additional execution related information that is concurrency dependent becomes necessary as otherwise one cannot (easily or not at all) understand what is going on in your own ooRexx program! --- Ad variable pools, lock counts with the activations (invocations) possessing the lock: the sample programs I supplied have not (yet) demonstrated the usefulness of these three pieces of information. They become helpful the moment you hit e.g. deadlocks and have a need to find out which activation (invocation) gets locked out because of which other activations (invocations) where locks get added and where removed. At this moment, I am afraid, it probably does not really make sense to come up with a more complex example that demonstrates that, as long as it is not understood in the first place why it is really necessary to get more than the thread id shown in the trace output in order to become able to understand and to debug multithreaded ooRexx programs as mt91.rex was intended to demonstrate. --- In the end the point is: ooRexx - unlike classic Rexx - is by design a very flexible and multithreaded programming language. The trace keyword instruction in ooRexx has not been updated to support understanding and debugging such mulithreaded ooRexx programs. Adding the thread id by default like you suggest is a first step that allows for understanding and debugging simple multithreaed ooRexx programs. Also, it allows understanding and debugging classic Rexx programs that get dispatched on different threads by a hosting program that runs Rexx programs on different of its threads. However, adding the thread id is not sufficient to understand and to debug powerful/complex multithreaded ooRexx programs. ooRexx programs can be executed in a multithreaed manner by different means: here some of the ooRexx-related means that play a role in a multithreaded ooRexx world: * the REPLY keyword statement (returns and activation/invocation gets carried on on a different thread) * the START message (root class .Object or class .Message) * methods marked as GUARDED or UNGUARDED (lock related) * the GUARD keyword statement (lock related) * Rexx instances executing ooRexx programs in a multithreaded context (dispatched by host applications) --- Of course any other form, idea to supply the necessary information to help understand and debug multithreaded ooRexx programs/applications is welcome and appreciated! However, so far only supplying the thread id in the trace message was suggested which is not enough information, unfortunately. --- Ad format: of course one can use less widths but at this point in time is maybe not as important as understanding/willing to supply the necessary additional information that enables, empowers one to fully understand and debug multi-threaded ooRexx programs. Please do not misunderstand me: I also think that defining the format is very important to help/ease programmers understand the output. So a discussion about the format, sequence, abbreviations is welcome of course. The more improvements the better. --- Context information like R (Rexx instance), T (thread number), A (activation/invocation), V (variable pool), lock-counts with asterisk (*) indicating who owns the lock, are really important for becoming able to understand and to debug multithreaded ooRexx programs. It is o.k. to challenge each such information. And again: if there are ideas about alternative ways enabling ooRexx programmers to quickly understand and to debug multithreaded ooRexx programs, they are welcome. However, so far no such suggestion, idea has been communicated. And as the need for such information is a reality (in my case urgently vis-à-vis my students who start as beginners and at the end of the semester employ multithreaded ooRexx programs with the need to understand and to debug them as easy as possible) I would very much make it available ASAP. However, I am interested in a constructive procedure, if possible. ---rony _____ From: Rony [mailto:rony.flatsc...@wu.ac.at] Sent: 16 February 2023 21:43 To: Open Object Rexx Developer Mailing List Subject: Re: [Oorexx-devel] Musings with tracing multithreaded ooRexx programs, mt91.rex: on two Rexx interpreter instances (RII) Am 15.02.2023 um 18:57 schrieb Mike Cowlishaw <mailto:m...@speleotrove.com> <m...@speleotrove.com>: As for the 'spaced out' case (excerpt below) ... this really would not work for me. I often have 5-9 windows open when I'm programming and these are 80 characters wide so I can minimise overlaps. With the suggested layout this would only work for programs less than ~40 characters wide! Here's how the excerpt looks for me (and this example has very short lines -- most of my programs use 72 or more characters per line for better commentary): ---> mt91.rex_nr_1_via_JSR223 R1 T1 A1 3 *-* t=.Test~new R1 T1 A2 V1 1* 21 *-* say "arrived in:" .context~name arrived in: INIT R1 T1 A2 V1 1* 22 *-* counter=0 R1 T1 A1 >>> "a TEST" R1 T1 A1 4 *-* t~m1 R1 T1 A3 V1 1* 27 *-* counter+=1 -- increase counter R1 T1 A3 V1 1* 28 *-* say "arrived in:" .context~name "before reply" Almost any line of any length will wrap. That's why the trace headers in Rexx are kept as short as feasible. Yes trace has been well thought out and well designed. It seems that you are under the impression that this extra trace information should get added to trace by default? If so, that is not the case. In effect as designed and communicated (maybe badly), you need to activate this extra trace information explicitly which you would do only, if you have an important reason: needing additional information in order to become able to debug multithreaded programs. You would so only, if it helps, if it benefits the programmer in understanding and debugging. Otherwise the option would not be used. You would probably not be a candidate needing this extra information (assuming that you are not employing ooRexx multithreading) so you would not activate it and everything would remain the same for your specific configured working environment. Nothing changes for you. Those who are in need will be more than happy to get these extra information to save a lot of time and becoming able to analyze and to understand multithreaded problems. Without this extra, multithreaded related trace information it would not be possible the more complex, the more interleaved multithreaded execution takes place. Just look at the trace, trace with thread number and with extended/mulithreading-related trace e.g. with the mt91.rex example in my earlier post: without the multithreaded/extended trace no one could understand what really happens there. This is the core of the problem: trace has not been extended to supply this necessary multithreaded/extended information in case multithreading causes problems and the programmer requests the multithreaded-related/extra information with trace. Adding an unexplained 27 characters on the front of each line makes little sense, Unexplained, hence surprising? Seriously, this extended trace information would get created only if the ooRexx programmer/user requests it explicitly. To be able to request the multithreaded trace one needs to be aware a) it exists and b) the option to get it activated. This is only possible if the programmer/user reads the documentation which will include the explanation of what the extra trace information is about: R is the Rexx instance, T is the thread number, A is the activation (invocation), V is the variable pool, the number column shows the number of requested locks, the asterisk the owner of the lock (that may cause deadlocks for others). The mnemonics make it easy to remember. Everyone who read that or got the explanation for the letters will be able understand what is being presented. So, they get explained and everyone who read the documentation would understand. Also, it is easy to explain (hence also easy to understand), but only if this information is made available. especially as the information is the same on most lines, The information repeats in some columns in these traces. The purpose of this is to become able to spot and analyze immediately lines that deviate, e.g. the Rexx instance number changes, the thread number changes, the activation number changes etc. which is the case when concurrently other parts get executed, interleaving with the trace lines that otherwise have a few columns in common. Without the multithreaded/extra information one is not able to notice that at all! This is exactly the problem in multithreaded programs, the more concurrency the more important to become able to notice that in order to become able to analyze and for that reason you need to carry these information! and as I mentioned before is not user-friendly (here I mean 'user' as being a writer of Rexx programs, not someone who runs a Rexx program without looking at it or caring which language it is written in). This is where we differ: leaving that away to make it user-friendly by your definition (keep what you see as redundancy out, saving space to keep the lines within 80 columns etc.) makes it impossible for the user to find out how his program gets executed in a multithreaded environment (application hosted and/or self induced) and in case of a problem to become able to find the cause. The user is rendered helpless in such a situation. Of course you are right that this extended information is not necessary for non-concurrent Rexx programs, which classic Rexx programs are by nature (not having the means in the language to kick off multithreading). Therefore it goes without saying that the multithreaded/extended trace needs to be off by default. A user will activate multithreading trace only if he gets additional helpful information for his program and problem. As I teach every semester newcomers to ooRexx and can observe where and why students get into problems and loose (sometimes literally weeks) because of problems in multithreaded environments because of lack of related trace information supplied by ooRexx. This is typically in GUI environments (awt/swing, JavaFX), something that is very important to master in todays world. They would become able to see and analyze how their Rexx programs execute in such an environment. Currently, they can only try to code SAY debug statements scattered over their code and still will not have information like which activation owns at a certain point in the execution thread the lock and which ones are blocked etc. (The available multithreaded trace supplies exactly these insights which would help them immediately and a lot.) To sum up: the multithreaded trace is not meant to be active by default. A programmer/user needs to activate it and if he is doing so, there must be a good reason as the produced extra debug information may be overwhelming (but extremely helpful and time-saving). Any Rexx programmer/user that does not employ multithreading will never activate that option and everything remains untouched and the way it has been for the last 40 years. - If you have alternative ideas about how one could come up with the same information for multithreaded ooRexx programs in a manner, that you see more user friendly and carries comparable explanation and problem solving power then I really would be interested in it. The main use case is a deadlock due to multithreading issues in some ooRexx program(s). -rony
_______________________________________________ Oorexx-devel mailing list Oorexx-devel@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/oorexx-devel