Dear Josep Maria,

Again, thank you very much for your thorough write-up!

Having had a little more time, I searched the Internet for resources that explain and document, briefly but professionally, the terms and definitions used in this thread, as this makes the discussion easier for others to research (e.g. I know that Windows by default consults the current directory when searching for executables, whereas on Unix this is /usually/ not the case, so one must add the current directory, symbolized by the dot '.', to the PATH).

Here are two resources that may help in this discussion:

 * PATH (variable): <https://en.wikipedia.org/wiki/PATH_(variable)>
 * Path (computing): <https://en.wikipedia.org/wiki/Path_(computing)>

Regarding the resolution of Rexx programs, here are a few terms that are used further down:

1. srcDir: the source directory of the Rexx program that currently gets run (one can get at it by
   extracting the path2pgm's location from 'parse source . . path2pgm')
2. currDir: the current working directory in which the Rexx program executes
3. pathDir: the directories listed on the PATH environment variable

4. relativePath: any path that does not start with the root directory, which therefore gets
   resolved relative to currDir
5. absolutePath: any path that starts out with the root directory, which therefore locates exactly
   the desired executable

6. unqualifiedExecutable: the name of an executable without path information
7. relativeExecutables: the name of an executable with relative path information, i.e. relativePath
8. absoluteExecutables: the fully qualified name of any executable, i.e. absolutePath
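
The three kinds of executable names can be told apart mechanically. What follows is only a minimal sketch in Python, not anything taken from Rexx itself: the function name classify and its windows flag are made up for illustration, and it assumes that drive-relative ("C:json.cls") and root-relative ("\json.cls") Windows names count as relative, because pathlib only regards a Windows path as absolute when it has both a drive and a root.

```python
from pathlib import PurePosixPath, PureWindowsPath

def classify(name: str, windows: bool = False) -> str:
    """Classify a program name using the terms defined in the list above.

    Hypothetical helper for illustration; not part of Rexx/ooRexx.
    """
    path = PureWindowsPath(name) if windows else PurePosixPath(name)
    if path.is_absolute():
        return "absoluteExecutable"
    separators = "/\\" if windows else "/"
    # Assumption: any directory or drive information makes the name relative.
    # The raw string is inspected because pathlib silently drops a leading "./".
    has_path_info = bool(path.drive) or any(sep in name for sep in separators)
    return "relativeExecutable" if has_path_info else "unqualifiedExecutable"

print(classify("json.cls"))
print(classify("./json.cls"))
print(classify("C:\\Program Files\\ooRexx\\json.cls", windows=True))
```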

Searching for executables via the operating system:

 * Unix-like:
     o unqualifiedExecutables: get searched along the pathDir (PATH) in the order supplied, if not
       found an error gets raised
     o relativeExecutables: the supplied information gets appended to currDir (current working
       directory) and denotes the exact location of the executable, no further searches are
       undertaken and if not found an error gets raised
     o absoluteExecutables: denotes the exact location of the executable, no further searches are
       undertaken and if not found an error gets raised

 * Windows:
     o unqualifiedExecutables:
         + first the current working directory gets searched for it and if not found
         + search along the pathDir (PATH) in the order supplied, if not found an error gets raised
     o relativeExecutables: the supplied information gets appended to currDir (current working
       directory) and denotes the exact location of the executable, no further searches are
       undertaken and if not found an error gets raised
     o absoluteExecutables: denotes the exact location of the executable, no further searches are
       undertaken and if not found an error gets raised

So the resolution of executables is the same on Unix and Windows except for Windows first searching currDir (the current working directory) in the case of unqualifiedExecutables.
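
The operating system rules just listed can be sketched as a small Python function. This is only an illustration under stated assumptions: os_search, windows_style and the injected exists predicate are hypothetical names, paths are modeled with forward slashes on both platforms, and None stands in for "an error gets raised".

```python
import posixpath

def os_search(name, curr_dir, path_dirs, exists, windows_style=False):
    """Sketch of the operating system lookup described above (hypothetical helper).

    exists is a predicate telling whether a candidate file is present.
    """
    if posixpath.isabs(name):
        # absoluteExecutable: exact location, no search
        return name if exists(name) else None
    if "/" in name:
        # relativeExecutable: resolved against currDir only, no search
        candidate = posixpath.normpath(posixpath.join(curr_dir, name))
        return candidate if exists(candidate) else None
    # unqualifiedExecutable: Windows looks in currDir first, then along PATH
    search_dirs = ([curr_dir] if windows_style else []) + list(path_dirs)
    for directory in search_dirs:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None

# Made-up file layout, only for demonstration.
files = {"/work/json.cls", "/opt/oorexx/json.cls"}
print(os_search("json.cls", "/work", ["/opt/oorexx"], files.__contains__))
print(os_search("json.cls", "/work", ["/opt/oorexx"], files.__contains__,
                windows_style=True))
```

The only branch that differs between the two platforms is the last one, which mirrors the summary above.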

---

Conclusion #1: if relativeExecutables get searched and are not found relative to the current working directory, then no further search takes place and an error gets raised! This is how the operating systems behave, no matter whether using a shell/terminal or system services.

Conclusion #2: if non-operating system software behaves differently than the operating system, then this is the responsibility of that software and needs to be documented. E.g. the observation that a compiler like gcc will use the paths in some environment variables in a different manner (e.g. using the value of the INCLUDE environment variable for locating c/cpp include files) does not define/determine/change how PATH should get used for locating unqualifiedExecutables.

Conclusion #3: if a Rexx CALL that causes an external search, or an ooRexx ::requires directive (which, the first time it is encountered, causes a CALL of the denoted external file), gets executed, then the following rules (should) apply:

 * Rexx programs that are unqualifiedExecutables:
     o srcDir gets searched first and if not found
     o the operating system search for unqualifiedExecutables gets carried out next and if not
       found an error gets raised
 * Rexx programs that are relativeExecutables: the supplied information gets appended to currDir
   (current working directory) and denotes the exact location of the Rexx program, no further
   searches are undertaken and if not found an error gets raised
 * Rexx programs that are absoluteExecutables: denotes the exact location of the Rexx program, no
   further searches are undertaken and if not found an error gets raised
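
The rules of conclusion #3 differ from the plain operating system lookup only in that srcDir is tried first for unqualified names. Here is a minimal Python sketch of that reading; rexx_search, os_dirs and exists are hypothetical names, not the interpreter's actual implementation, paths are modeled with forward slashes, and the file layout is made up to resemble the rexxtry.rex case quoted further down.

```python
import posixpath

def rexx_search(name, src_dir, curr_dir, os_dirs, exists):
    """Sketch of the rules in conclusion #3 (hypothetical helper).

    os_dirs is whatever the operating system would search for an unqualified
    name: PATH on Unix, currDir followed by PATH on Windows.
    """
    if posixpath.isabs(name):
        # absoluteExecutable: exact location, no search
        return name if exists(name) else None
    if "/" in name:
        # relativeExecutable: resolved against currDir only, no search
        candidate = posixpath.normpath(posixpath.join(curr_dir, name))
        return candidate if exists(candidate) else None
    # unqualifiedExecutable: srcDir first, then the operating system search
    for directory in [src_dir, *os_dirs]:
        candidate = posixpath.join(directory, name)
        if exists(candidate):
            return candidate
    return None

files = {"/opt/ooRexx/json.cls", "/work/json/json.cls"}
args = ("/opt/ooRexx", "/work/json", ["/usr/bin"], files.__contains__)
print(rexx_search("json.cls", *args))    # found in srcDir
print(rexx_search("./json.cls", *args))  # found in currDir
```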

Given these findings and conclusions, it is probably a misconception to expect Rexx/ooRexx to behave like gcc (when employing the INCLUDE environment variable directories) rather than the way operating systems resolve PATH.

In the case of Rexx/ooRexx the documentation defines that, for unqualifiedExecutables, srcDir is searched first and then pathDir. (The documentation probably needs to be improved w.r.t. the above, as currently one can observe quite some confusion even among long-time users while discussing this issue.)

Would that be applicable for your case as well?

Best regards

---rony



On 07.02.2023 11:35, Josep Maria Blasco wrote:
Hello Rony,

I sent a previous reply, but it was intercepted by the SourceForge filters because it contained a zip file attachment. SourceForge informed me about this fact only some few hours ago (sigh). I will take the opportunity to compose a new, more thoughtful, reply.

Message from Rony G. Flatscher <rony.flatsc...@wu.ac.at> on Sun, 5 Feb 2023 at 19:40:

(snip)

        Not being the expert in this corner a question to you as you have developed a deep
        insight in the meantime: while developing on json.cls there are two json.cls present at
        the moment (Windows layout):

            C:\Program Files\ooRexx\json.cls
            C:\Program Files\ooRexx\rexxtry.rex

            F:\work\svn\oorexx\sandbox\rony\json\json.cls

            current directory is: F:\work\svn\oorexx\sandbox\rony\json

        Now while doing interactive tests via rexxtry.rex (i.e. "C:\Program
        Files\ooRexx\rexxtry.rex") doing a "call json.cls" will find and call "C:\Program
        Files\ooRexx\json.cls" and *not* "json.cls" in the current directory!

    Sure :) Rexxtry.rex is in the "C:\Program Files\ooRexx" directory. This is the "same"
    directory of the docs, or the "parent" directory of the source (I'd prefer to call it the
    "caller's" directory, I think it's a more accurate description; but I digress). "Call
    json.cls" is one of the cases where Rexx works as documented. And the search order begins...
    with the "same" directory, as per the docs. That's why the version in "C:" is called.

        However, in the same rexxtry.rex session in the current directory doing a "call
        .\json.cls" *will* resolve and call "F:\work\svn\oorexx\sandbox\rony\json\json.cls": so
        "." will force using the current directory for finding external Rexx programs.

    That's exactly our bug.
    Hmm, maybe it is not a bug after all, but working as intended as this allows for overriding
    the default search order that starts out in the source directory (the directory the Rexx
    script got loaded and run from as in the case of rexxtry.rex, doing a "parse source . .
    path2source" would denote the Rexx program's source directory.)


I would object to that. If it were working, as you say, "as intended", then it should have been documented. But it is not. My impression is that /this is just a trick that works and that we are used to this trick/, to a point that we find it natural.

The search order algorithm, by definition, imposes a certain /opacity/ when there are duplicates. A super-path list is formed by concatenating the caller's directory, the present directory, the application-defined extra path, and the contents of REXX_PATH and PATH. Let p_1, ..., p_n be the component paths of this super-path list. Assume that in the search for name.ext we have several i in [1..n] such that p_i || sep || name.ext exists (sometimes it's a little more complicated than a simple concatenation, but allow me this for the sake of simplicity). Let i_1, ..., i_k be these indexes, where i_1 < i_2 < ... < i_k. Then the fact that we find name.ext in p_{i_1} /turns all the other occurrences of name.ext/ (namely, p_{i_j} || sep || name.ext, for j in [2..k]) /opaque to the search algorithm/, and consequently also to ourselves as users of RexxTry or of a trace instruction.
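
To make the opacity concrete, here is a minimal Python sketch. The name split_hits, the exists predicate and the directory layout are all made up for illustration, and simple concatenation with a forward slash is assumed. It walks the super-path list once and separates the first hit, p_{i_1}, from the shadowed ones, p_{i_2}, ..., p_{i_k}.

```python
import posixpath

def split_hits(super_path, name, exists):
    """Return (visible, opaque): the first hit and the hits it shadows.

    Hypothetical helper; exists tells whether a candidate file is present.
    """
    candidates = [posixpath.join(p, name) for p in super_path]
    hits = [c for c in candidates if exists(c)]
    if not hits:
        return None, []
    return hits[0], hits[1:]

files = {"/opt/ooRexx/json.cls", "/work/json/json.cls"}
super_path = ["/opt/ooRexx", "/work/json", "/usr/bin"]
visible, opaque = split_hits(super_path, "json.cls", files.__contains__)
print(visible)  # the only occurrence a first-match search can return
print(opaque)   # occurrences that exist but are never reached
```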

We cannot expect that there is a special "trick" to refer to any of the p_{i_2}, ..., p_{i_k}. The fact that "." works /is a consequence of the malfunctioning of the search algorithm, not an escape mechanism that works as intended/.

/Unless/ we first /define/ it to be so, which we have not done.


    If I understand you properly, the only difference between this last test and the previous one
    is the addition of ".\". This is, precisely, one of the cases where what Rexx does and what's
    in the docs /differ/. ".\" should mean "in /this/ directory". But what does "this" mean,
    i.e., "this" relative to what?
    Probably "the current directory".


If you were using the command line, you'd be right. For a programming language, it's not so obvious. In any case, it's something that has to be (1) defined and (2) documented.

Let's see what other languages do.

The gcc compiler, for example, takes "." to mean "each and every directory specified with the -I compiler option" (tried in turn, after the directory of the including file itself). Then, for example, #include "./this.h" will refer to whatever directories have been specified. Each and every one of them. "." is relative. I can provide test files if you want them (but not in a zip file, as I've had to learn).

    #include <stdio.h>
    #include "./inc.h"   /* expected to #define THIS */

    int main() {
      printf("%s\n", THIS);
      return 0;
    }


All that main does is print THIS, which obviously has to be defined somewhere else, i.e., in ./inc.h. Now, what does ./inc.h refer to? Well, it depends on what we specify in the -I compiler option. Let's assume that we have two subdirectories called one and two. Now after compiling with

    gcc -Ione -o main main.c,


if we type "./main", we will get

    one,


assuming of course that inc.h in one #defines THIS as "one"; and after compiling with

    gcc -Itwo -o main main.c,

we will get

    two


assuming, once more, the necessary and symmetrical #defines. That's what (a compiler for) a compiled language does.

Let's see, on the other hand, the path calculations performed by Python, a dynamic, interpreted language:

    >>> import os
    >>> from pathlib import Path
    >>> os.getcwd()
    'D:\\Dropbox'
    >>> (Path("C:") / "./int.rex").resolve()
    WindowsPath('C:/Users/jmblasco/int.rex')


That is, "./", relative to "C:", is "C:/Users/jmblasco/int.rex", something relative to /the current directory of the C: drive/ (which happened to be C:\users\jmblasco), not something relative to the current directory (which resides in the D: drive).
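
The same point can be checked on any platform with pathlib's pure paths, which only manipulate the text of a path and never consult the file system. A small sketch follows; the variable name is arbitrary. Before resolve() is applied, the joined path is still drive-relative, which is why its final meaning depends on the current directory of the C: drive.

```python
from pathlib import PureWindowsPath

# Joining onto a bare drive keeps the result drive-relative: it has a drive
# but no root, so it is not an absolute path yet.
drive_relative = PureWindowsPath("C:") / "./int.rex"

print(drive_relative)                # C:int.rex
print(repr(drive_relative.drive))    # 'C:'
print(repr(drive_relative.root))     # ''
print(drive_relative.is_absolute())  # False
```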

    Well, according to the docs, we have to respect the search order. And, in the first place...
    "this" should be relative to the "same" directory, that is, the very same "same" directory
    (sorry). That's the caller's directory, that is, where rexxtry resides. BUT... here's this
    bug we are talking about: since the callee's filespec begins with ".\", the search path is
    bypassed (it should not be!). This means that the filespec is resolved (more or less) like in
    the command line (to be true, the resolution mechanism provided by the SearchPath Windows API
    --what Rexx for Windows uses-- differs from the command line one, with the clear intention to
    get the programmers and users altogether completely braindamaged, I'm obliged to presume).
    But when you're in the command line (and, in this case, when you call SearchPath), "." means
    the /current/ directory, not the /same/ or caller's directory. Hence, the F: version of
    json.cls gets called.

    This was actually the intention when doing a 'call ".\json.cls"', to override "json.cls" in
    rexxtry.rex' source directory. What would you suggest to use instead, a fully qualified path
    to ".\json.cls" (which one could hardly hard code in advance)? Maybe I have myself not fully
    understood all the ramifications of what you suggest, or why you would regard 'call
    ".\json.cls"' resolving the file in the current directory would be an error (for Rexx that is).

Well, I would have called (temporarily) the new version "jsonnew", or "json2", or some such. Or I would have moved it somewhere else, for example to the parent directory relative to the old version, and then used call ..\json.cls. Or to a subdirectory of the same, and then used call subdir\json.cls, and so on. As I've said, the expectation that "." refers to the current directory is caused by the current behaviour of the interpreter, and is not according to the documentation.

Both gcc and Python work in the following way: let pathlist = p_1,...,p_n be a path list, and let fn = "path\name.ext" be a relative or absolute path specification. Then

    Search(pathlist,fn) = fn, when fn is absolute, and
    Search(pathlist,fn) = the first (p_i * fn) that exists, or
    Search(pathlist,fn) = .nil otherwise.
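
The definition above translates almost line by line into Python. This is only a sketch under stated assumptions: search and exists are hypothetical names, the "*" operation is taken to be a normalized join, paths are modeled with forward slashes, and None plays the role of .nil.

```python
import posixpath

def search(pathlist, fn, exists):
    """Sketch of Search(pathlist, fn) as defined above (hypothetical helper)."""
    if posixpath.isabs(fn):
        return fn  # an absolute fn is returned as is
    for p in pathlist:
        # p * fn: join, then collapse "." and ".." fragments
        candidate = posixpath.normpath(posixpath.join(p, fn))
        if exists(candidate):
            return candidate
    return None  # stands in for .nil

# Made-up layout: the caller's directory comes first in the path list.
files = {"/opt/ooRexx/json.cls", "/work/json/json.cls", "/app/master.cls"}
print(search(["/opt/ooRexx", "/work/json"], "./json.cls", files.__contains__))
print(search(["/app/classes", "/work/json"], "../master.cls", files.__contains__))
```

Unlike the operating system lookup, a name such as "./json.cls" or "../master.cls" is tried against every entry of the path list here, so the caller's directory wins whenever it comes first.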


Now the "*" operation is, in general, a simple concatenation (adding a path separator if needed), i.e., "." * "name.ext" = ".\name.ext", where "." refers to the current directory, "C:\some\dir" * "..\name.ext" is "C:\some\dir\..\name.ext", that is "C:\some\name.ext", regardless of whether "C:" is the current drive or not, and, similarly, "C:\some\dir" * ".\name.ext" = "C:\some\dir\.\name.ext", that is, "C:\some\dir\name.ext", where the "." fragment operates as a no-op.

In some cases, "*" is not a direct concatenation, but a concatenation of the more composable parts of p_i and fn. For example, in Python, (Path("C:/one") / "/two/a.file").resolve() = WindowsPath('C:/two/a.file').
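
The "*" cases discussed above can be reproduced on any platform with Python's ntpath module, which applies Windows path rules to plain strings. This is a minimal sketch; the name combine is made up, and treating "*" as join followed by normpath is an assumption about what the operation amounts to.

```python
import ntpath

def combine(p, fn):
    """Sketch of the "*" operation: Windows-style join, then normalization."""
    return ntpath.normpath(ntpath.join(p, fn))

print(combine("C:\\some\\dir", "..\\name.ext"))  # C:\some\name.ext
print(combine("C:\\some\\dir", ".\\name.ext"))   # C:\some\dir\name.ext
print(combine("C:/one", "/two/a.file"))          # C:\two\a.file
```

The last call shows the non-trivial case: the drive of p is kept, while the rooted fn replaces the directory part.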

There are, of course, other algorithms; /they are all more complicated to describe than this one/, more full of exceptions, more difficult to understand and memorize, and in this respect, more error-prone. They also tend to exhibit subtle differences between them. For example, the SearchPath Windows API is more stringent than the Windows CLI. My claim is that the algorithm I am referring to is (1) the simplest non-trivial one, and, in this sense, the easiest to remember for a programmer, and (2) the algorithm used in very well-known languages, which would help to lower the astonishment factor (see Mike's least astonishment principle).

Please note that the bug surfaced in the ".." case, not in the "." one, and applies also to the slash-relative and drive-relative paths. If the bug had existed only in the "." case, I'd agree that, since "." in the middle of a path is a no-op, one could well /decide/ that it would mean "the current directory" when used as a prefix (but then one would also have to /document that decision/). But, in the ".." case, it seems obvious that the fact that the caller's directory is bypassed (and that all the other directories are also bypassed, except the current one) is a bug.

And it would be very difficult to justify handling the ".." case differently than the "." one.

    If you had instead tried "Call C:json.cls", you'd have had the C: version called. This is
    maybe the easiest way not to get lost: "Call C:json.cls", and "Call F:json.cls". Of course,
    the current directory of C: /and/ of F: have to point to the right places.

        So it seems that removing "." may have side effects that are not (yet) covered by the
        test cases?

    To the contrary: if you had tried the Ubuntu version of the interpreter after having applied
    my patch (https://sourceforge.net/p/oorexx/bugs/1865/?limit=25#687f), then "Call json.cls"
    and "Call ./json.cls" would have had the same effect. As one would expect, both would have
    called the version in the /same/ directory. To call the version in the /current/ directory
    without using a complete, absolute path, you'd have had to resort to F:json.cls or the like :)

    If possible the lookup should work the same on Unix and Windows such that a Rexx programmer
    would not have to worry on which platform the program gets run.

Yes, of course, I agree completely with you. As I said above, I don't think one can maintain the expectation that there is a shortcut to overcome the opacity caused by the search algorithm.

One could devise a "full" search algorithm that returned, in a collection, all the possible "hits". But that's not what we have, at present.

You can, of course, always resort to Call (Directory()"\json.cls"). Or even create a variable V for this value and then Call (V).

    Personally I never use relative file paths if not absolutely necessary (and very rarely so,
    hence being more than rusty by now!). Rather than using relative paths in the Rexx programs I
    would adjust the PATH environment variable such that the desired search order is reflected via
    it and then start the Rexx program in that environment.

Well, I think there are cases when the use of upwards-relative paths (what initiated our conversation) is very reasonable. For example, when you have a master.cls somewhere, and you want to store all its subclasses in a subdirectory in a system- and path-independent way, it's very convenient to be able to write ::requires '../master.cls'.

I was using such constructions with Apache under Ubuntu 20.04 and Rexx 4.2.0. When I migrated to Ubuntu 22.04 and Rexx 5.0.0 (which also upgraded Apache), they stopped working. I don't know what happened; clearly, the current directory had changed, but I've not had the resources or the time to investigate why. It's not so clear to me what the current directory is supposed to be when you have a CGI called from an Apache action handler.

If the search order had worked as advertised, I would not have had any problem. That's what led me to investigate this aspect of search.

    If you look up the test framework and how to get to its Rexx packages/programs that is exactly
    what "setTestEnv.bat" (Windows) and "setTestEnv.sh" (Unix) do in "test/trunk". That is the
    reason why only unqualified file names are needed in all requires directives in the entire
    test package.

    But again, I am no expert here, so hoping that Rick or Erich can shed some more light on this.

    Best regards

    ---rony

Once more: I don't think there's a clear, evident way to settle this conversation. A /decision/ has to be taken. And it has to be /explained/ (i.e., documented) and, if possible, /justified/. The last part is optional, of course: one can define a language as one sees fit.

The weight, if any, of my contribution, is only to emphasize two things:

  * Other languages tackle this problem in a particular, coincident way.
  * And that way is the most economical in terms of describing the search procedure.

This does not mean that what I am proposing should be accepted. It's only my point of view.

  Josep Maria
_______________________________________________
Oorexx-devel mailing list
Oorexx-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/oorexx-devel
