Re: [Oorexx-devel] An attempt to clarify (Re: The search order bug: a progress report, and some questions

Josep Maria Blasco Sun, 26 Feb 2023 04:36:49 -0800

Hi Rony,

Some more interesting data I collected today. Regina and the Windows
Command Prompt (CMD.EXE) *give exactly the same results when tested*.
Additionally, they are both directory-first SOAs.


This means that:

   - If the filename contains a relative path, only the current directory
   is searched ("If the command name does contain a path, the shell only
   checks the specified directory for a matching executable file", but "the
   specified directory", when it is relative, is resolved against the current
   directory).
   - Directory-first: "If the command name includes a file extension, the
   shell searches each directory for the exact file name specified by the
   command name. If the command name does not include a file extension, the
   shell adds the extensions listed in the PATHEXT environment variable, one
   by one, and searches the directory for that file name. Note that the shell
   tries all possible file extensions in a specific directory before moving on
   to search the next directory (if there is one)".

For CMD.EXE, the super-path is conceptually formed by prepending ".;" to
the PATH environment variable, and the extension list is taken from the
PATHEXT environment variable.

The directory exception algorithm is triggered by finding a path separator
("\") in the filename, exactly what Regina does.

The extension exception algorithm is triggered when the supplied filename
has an extension, and then only this extension is searched.
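
For illustration only, the rules above can be modelled in a few lines of
Python. This is a sketch of the behaviour as described in the quoted
Microsoft text, not of the actual CMD.EXE code. The names cmd_search,
exists, FILES, PATH and PATHEXT, and the sample files, are assumptions of
this example; the file system is simulated with a plain set so that the
sketch runs on any platform.

```python
import ntpath  # Windows path rules, importable on any platform


def cmd_search(filename, path, pathext, exists):
    """Return the first match for filename, or None (directory-first)."""
    # Directory exception: a "\" in the name limits the search to ".".
    if "\\" in filename:
        directories = ["."]
    else:
        # Super-path: ".;" conceptually prepended to PATH.
        directories = ["."] + [d for d in path.split(";") if d]

    # Extension exception: an explicit extension is the only one tried.
    if ntpath.splitext(filename)[1]:
        extensions = [""]
    else:
        extensions = [e for e in pathext.split(";") if e]

    for directory in directories:      # outer loop: directories
        for ext in extensions:         # inner loop: extensions
            candidate = ntpath.normpath(ntpath.join(directory, filename + ext))
            if exists(candidate):
                return candidate
    return None


# Simulated, case-insensitive file system (hypothetical sample files).
FILES = {r"tool.cmd", r"lib\helper.bat", r"c:\bin\tool.exe", r"c:\bin\lib\extra.exe"}


def exists(name):
    return name.lower() in FILES


PATH = r"C:\bin"
PATHEXT = ".COM;.EXE;.BAT;.CMD"

print(cmd_search("tool", PATH, PATHEXT, exists))
print(cmd_search("tool.exe", PATH, PATHEXT, exists))
print(cmd_search(r"lib\extra", PATH, PATHEXT, exists))
```

In this model, "tool" resolves to the .CMD file in the current directory
even though .EXE precedes .CMD in PATHEXT and a tool.exe exists in C:\bin,
which is what directory-first means. The last call finds nothing, because
the path separator in "lib\extra" keeps the search away from PATH.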

REXXSAA (modulo the SAA bug) behaves in a way which is similar to Regina,
but its SOA is extension-first (and there are some other differences, like
the default extensions, etc).
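
To make the difference between the two orders concrete, here is a small
Python sketch. The two files are the ones the test tree in the quoted
message uses for this purpose: an extensionless "oorexxextensions" in the
current directory, and "oorexxextensions.rex" in the path directory. The
function names, the forward-slash paths and the reduced extension list
(".rex", then no extension) are simplifications assumed for the example.

```python
from itertools import product


def directory_first(directories, extensions):
    """All extensions are tried in one directory before the next one."""
    return product(directories, extensions)


def extension_first(directories, extensions):
    """All directories are tried with one extension before the next one."""
    return ((d, e) for e, d in product(extensions, directories))


def search(order, name, directories, extensions, files):
    """Return the first existing candidate in the given order, or None."""
    for directory, ext in order(directories, extensions):
        candidate = f"{directory}/{name}{ext}"
        if candidate in files:
            return candidate
    return None


# Simulated file system: the two marker files of the test tree.
files = {"curr/oorexxextensions", "path/oorexxextensions.rex"}
directories = ["curr", "path"]
extensions = [".rex", ""]

print(search(directory_first, "oorexxextensions", directories, extensions, files))
print(search(extension_first, "oorexxextensions", directories, extensions, files))
```

The directory-first search stops at the extensionless file in the current
directory, while the extension-first search reaches the .rex file in the
path directory first; this is why the two marker files return "directory"
and "extension" respectively.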

*Comment*. *One way to interpret these results is to realize that the
object-oriented variants of Rexx, that is, Object Rexx and ooRexx, both
deviate from the behaviour of the command line interpreters, by
incorporating new options and possibilities that CMD.EXE does not offer.
Which in turn means that the study of the undeviated behaviour of the
command line interpreters cannot help us much to fulfill our desire to
understand how ooRexx should behave, or to define and document things
according to that very same desire.*

*Sources:*

   - Windows NT Command Search Sequence
   <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/documents/Windows-NT-Command-Search-Sequence.md>.
   - Regina Rexx for Windows Search Order test results
   <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/windows.regina.results.txt>.
   - Windows NT Command Search Sequence test results
   <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/windows.cmd.results.txt>.


Kind regards,

  Josep Maria Blasco


Message from Josep Maria Blasco <[email protected]> on Sat.,
25 Feb. 2023 at 12:44:

> Dear Rony,
>
> Thank you for your very detailed post. You will have to pardon the
> belatedness of my reply: I wanted to do some research and perform some
> tests, to be able to reply to your mail as extensively as it deserves.
>
> I see that you have attempted to tackle the problem by comparing the
> behaviour of the command line processors under Windows and under the
> Unices. I'm not completely sure this is a good starting point. For some
> aspects of the problem, accepting that this is the right comparison will
> amount, in some senses that I'll be slowly detailing below, to begging the
> question.
>
> In particular, I'm not certain that we can so readily compare Rexx
> programs and the behaviour of *executables*. Rexx programs *are*, of
> course, executables. But, at the same time, Rexx programs are also *program
> source *(this is of course also true of all interpreted languages).
> Therefore, although we can probably learn and profitably borrow concepts
> and ideas from the executable program paradigm, we will also likely be able
> to learn and profitably borrow from the program source paradigm. Most
> likely, when we have finished our investigation, we will come up with a
> solution which takes some elements from (say) the behaviour of the command
> line interpreters, some other elements from the behaviour of certain
> static language compilers, like gcc or cl, and maybe still other
> elements from the behaviour of more dynamic, i.e., interpreted, languages.
>
> Let's recapitulate. Please bear with me for a while. I think we'll gain a
> lot if we take an ample perspective and allow ourselves a high degree of
> abstraction. We'll descend into the necessary details later.
>
> *Definitions*. A *Search Order Algorithm* (SOA) is a procedure with the
> following arguments.
>
>    - A *filename*. This can be specified as name.ext, or it can also
>    include a (possibly relative) path specification (e.g.,
>    some/path/name.ext).
>
>    - An ordered collection of ordered collections of (possibly only
>    partially specified) *directories* (or *paths*). One or more of these
>    directories can be *distinguished*, and constitute a fallback choice
>    when a *directory exception* is applied (see below).
>
>    A *super-path* is created by an order-preserving union of all the
>    ordered collections.
>
>    - Another ordered collection of ordered collections of (possibly
>    empty) *file extensions*.
>
>    A *super-list of extensions* is created by an order-preserving union
>    of all the ordered collections.
>
> The *goal *of the procedure is to locate and return the first file that
> matches the *filename*, resides in one of the *directories* and has one
> of the specified *extensions*.
>
> The search algorithm can be *extension-first* or *directory-first*. An
> *extension-first* algorithm searches for a file in all directories using
> the first of the extensions supplied; if not found, the search is initiated
> again, in all directories, with the second extension supplied, and so on. A
> *directory-first* algorithm first searches all extensions in the first
> supplied directory; if not found, it searches all extensions in the second
> supplied directory, and so on.
>
> In pseudo-code,
>
> Do dir Over directories -- Directory-first search algorithm
>   Do ext Over extensions
>      file = Check(dir, filename, ext)
>      If file \== "" Then Return file
>   End
> End
>
>
> and
>
> Do ext Over extensions -- Extension-first search algorithm
>   Do dir Over directories
>      file = Check(dir, filename, ext)
>      If file \== "" Then Return file
>   End
> End
>
>
> Every SOA can specify a procedure for *directory exceptions*,
> and another procedure for *extension exceptions*. The *directory
> exceptions* procedure can indicate that, instead of searching in all
> directories, the search will be limited to a designated subset of these
> same directories. The *extension exception* procedure can limit the
> search to a designated subset of the extensions, or require that the search
> is performed using no extension at all.
>
> A Search Order Algorithm is completely determined by its parameters, by
> the exception subalgorithms, if any, and by the fact that the search has to
> be performed extension- or directory-first.
>
> [*End of definition*]
>
> *Example 1. ooRexx*
>
> The ooRexx SOA is an *extension-first* SOA. The super-path is
> conceptually created as follows (assume the union method is order
> preserving) [rexxref 7.2.1.1]:
>
> Parse source . . myself
> mydir = Left(myself,LastPos(.File~separator,myself)-1)  -- up to the last separator
> directories = (mydir, ".")~ -     -- "same" directory and current directory
>   union( appDefinedPath )~ -      -- The EXTERNAL_CALL_PATH parameter
>   union( Value("REXX_PATH",,"ENVIRONMENT")~makeArray(.File~pathSeparator) )~ -
>   union( Value("PATH",,"ENVIRONMENT")~makeArray(.File~pathSeparator) )
>
>
> The super-list of extensions is conceptually created as follows:
>
> If REQUIRES_CALL Then extensions = .Array~of(".cls")
> Else extensions = .Array~new
> Parse source . . myself
> myname = Substr(myself,LastPos(.File~separator,myself)+1)  -- after the last separator
> Parse Value myname~reverse With txe"."eman
> If eman \== "", txe \== "" Then extensions = extensions~union( "."txe~reverse )
> extensions = extensions~union( appDefinedExtensions )  -- The EXTERNAL_CALL_EXTENSIONS parameter
> extensions = extensions~union( ".REX" )
> If \Windows Then extensions = extensions~union( ".rex" )
> extensions = extensions~union( "" )
>
>
> The directory exception algorithm currently returns .true when
>
> filename[1] == "~" | filename[1] == "/" | filename[1,2] == "./" |
> filename[1,3] == "../"
>
>
> for the Unices, and
>
> filename[1] == "\" | filename[2] == ":" | filename[1,2] == ".\" |
> filename[1,3] == "..\"
>
>
> for Windows, and then the search is limited to the distinguished
> directory, ".", i.e., to the current directory. The extension exception
> algorithm returns .true when the filename part of the file specification
> has a dot (".") in it.
>
> Please note that the directory exception algorithm is *undocumented*,
> while the extension exception algorithm is *documented *("If the routine
> name contains at least
> one period, then this routine is extension qualified" - rexxref 7.2.1.1).
>
> *Note: a bug in the Windows version of ooRexx. *The current
> implementation of the Windows extension exception algorithm, the boolean
> SysFileSystem::hasExtension function, incorrectly searches for "/"
> (instead of "\") as the path separator, so that the results of the function
> are, in some cases, incorrect. Please note that this is an error that is
> difficult to trigger, since the algorithm runs backwards, examining all the
> characters in the full file specification, and returns .true (i.e., "has
> an extension"), if a dot is first found, and .false (i.e., "there is no
> file extension") if a separator is found, or when the string is exhausted.
> Files with a true extension will correctly return .true, and files with
> no extension will return .false (because the string will be exhausted,
> not because a path separator has been found, but who cares?)... unless one
> or more of the directories in the path contain a dot, e.g., in cases like
> "my.dir\file", or, of course, also "..\filename" (I've reported this bug
> on SourceForge [ticket 1870 <https://sourceforge.net/p/oorexx/bugs/1870/>
> ]).
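
A minimal sketch of the backward scan described above, written in Python
rather than in the C++ of the interpreter. It only models the description
given in this message; the function name has_extension and its separator
parameter are assumptions of the sketch, not the real signature of
SysFileSystem::hasExtension.

```python
def has_extension(spec, separator):
    """Scan spec backwards: a dot seen first means "has an extension"."""
    for char in reversed(spec):
        if char == ".":
            return True
        if char == separator:
            return False
    return False  # string exhausted without finding a dot or a separator


for spec in (r"dir\file.rex", r"dir\file", r"my.dir\file", r"..\filename"):
    buggy = has_extension(spec, "/")    # separator the bug report says is used
    fixed = has_extension(spec, "\\")   # separator Windows actually uses
    print(f"{spec:<14} buggy={buggy!s:<5} fixed={fixed}")
```

The first two names give the same answer with either separator, which is
why the error is hard to trigger; only the last two, where a dot appears in
the directory part, are misclassified when "/" is used.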
>
> *Example 2. Regina Rexx*
>
> The Regina Rexx SOA is a *directory-first* SOA. The super-path is
> conceptually formed as follows [regina.pdf 1.4.2]:
>
> directories = Value("REGINA_MACROS",,"ENVIRONMENT")~makeArray( .File~pathSeparator )
> directories = directories~union( "." )
> directories = directories~union( Value("PATH",,"ENVIRONMENT")~makeArray(.File~pathSeparator) )
>
>
> The super-list of extensions is built as follows (to simplify, we will
> assume that REGINA_SUFFIXES is separated by commas, contains no blanks,
> and that all the extensions have their leading dot):
>
> extensions = .Array~of("")
> extensions = extensions~union( Value("REGINA_SUFFIXES",,"ENVIRONMENT")~makeArray(",") )
> extensions = extensions~union( (".rexx",".rex",".cmd",".rx") )
>
>
> The *directory exception* algorithm for Regina is very simple: when
>
> filename~contains( .File~separator )
>
>
> the search will be limited to the current directory. The *extension
> exception* algorithm reads as follows "If a known file extension is part
> of the file name only this file is searched", but I've not had the time to
> find out what this actually means.
>
> *Example 3. The concept of the "same" directory in C/C++ compilers*
>
> The concept of the "same" directory in C is *recursive* and, in the
> general case, may refer not to a single directory, but to a whole stack of
> directories:
>
> "*The preprocessor searches for include files in this order: In the same
> directory as the file that contains the #include statement. In the
> directories of the currently opened include files, in the reverse order in
> which they were opened. The search begins in the directory of the parent
> include file and continues upward through the directories of any
> grandparent include files [...]*" (source
> <https://learn.microsoft.com/en-us/cpp/preprocessor/hash-include-directive-c-cpp?view=msvc-170>
> ).
>
>
> *Example 4. Extension-first, or directory-first*
>
> REXXSAA
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/documents/external-search-order-in-rexxsaa-for-os2.md>
> and OBJREXX
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/documents/external-search-order-in-objrexx-for-os2.md>
> for OS/2, and ooRexx
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/documents/external-search-order-in-oorexx.md>
> use *extension-first* SOAs. Regina Rexx
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/documents/external-search-order-in-regina.md>
> and the Windows CLI
> <https://learn.microsoft.com/en-us/previous-versions//cc723564(v=technet.10)#command-search-sequence>
>  (CMD)
> use *directory-first* SOAs.
>
> *Excursus: same and current, CALL and ::REQUIRES*
>
> In some contexts, the notion of the *same* directory does not make sense.
> For example, in a Windows command prompt, where there is no source file to
> begin with. In other contexts, it's the notion of the *current* directory
> that is not applicable. For example, when one compiles a C program, the
> current directory at the moment of compilation is normally irrelevant and
> often completely unrelated to the compilation at hand. Although one can, of
> course, force the compiler to explore the current directory (for example,
> by using the "-I." compiler option) if one so desires.
>
> The *definition* of the ::REQUIRES directive states that "The program
> *programname* is called as an external routine with no arguments"
> [rexxref 3.7], but this should not allow us to forget that *the ontology
> of ::REQUIRES and the ontology of CALL are, indeed, not quite equivalent*.
>
> ::REQUIRES speaks of *source files*, of referencing blocks of code which
> constitute libraries we want to use. The natural logic for ::REQUIRES is
> the logic of source files, where the *same* directory has great
> relevance, while the *current* directory should, in practice, be almost
> always irrelevant.
>
> CALL, on the other hand, is a more dynamic statement. One can find cases
> where it's natural to refer to the same directory, and other cases where
> referring to the current directory (or to some other directory specified in
> an environment variable like PATH) is more natural.
>
> The semantics of CALL can be dynamically patched, by altering the current
> directory, the environment variables, or both.
>
> The semantics of a ::REQUIRES directive is determined and cannot be
> patched.
>
> This is extremely unfortunate: since the directory exception algorithm is
> undocumented, the semantics of ::REQUIRES will appear to a programmer as
> (1) erratic, and (2) too dependent on an extraneous notion, like the
> current directory.
>
> *A test program*
>
> Now we can rely on the above definitions and start to test different
> implementations of Rexx (and maybe other products, like certain compilers
> or interpreters), to try to deepen our understanding of the problem at
> hand. I've written a version-independent test program that runs under
> various interpreters and operating systems, namely:
>
> *Operating systems:*
>
>    - OS/2 (Arca Noae 5.0.7).
>    - Windows (Windows 11 Pro).
>    - Linux (Ubuntu 22.04.01 LTS).
>
> *Interpreters:*
>
>    - IBM REXXSAA for OS/2 ("REXXSAA 4.00 3 Feb 1999,").
>    - IBM Object REXX for OS/2 ("OBJREXX 6.00 18 May 1999")
>    - Regina Rexx for Windows, Linux and OS/2 ("REXX-Regina_3.9.5(MT) 5.00
>    25 Jun 2022")
>    - ooRexx for Windows and Linux (" REXX-ooRexx_5.0.0(MT)_64-bit 6.05 23
>    Dec 2022")
>
> The test program, sotest.rex, uses the following directory structure:
>
> (*Root* directory. Normally, "sotest")
>    |
>    +---> sotest.rex (The test initiator program. Calls
>    |             ./subdir/dotdotsame/same/main.rex)
>    |
>    +---> *subdir* (Dummy directory, for future expansion)
>            |
>            +---> *dotdotsame* (The parent of the "same" or caller directory)
>            |       |
>            |       +---> dotdotsame.rex (Returns "dotdotsame")
>            |       |
>            |       +---> *same* (The "same" or caller directory)
>            |               |
>            |               +---> same.rex (The program in the "same" or
>            |               |               caller directory. Returns "same")
>            |               +---> main.rex (The main program)
>            |               |
>            |               +---> *lib*
>            |                       |
>            |                       +---> samelib.rex (Returns "samelib")
>            |
>            +---> *dotdotcurr* (The parent of the current directory)
>            |       |
>            |       +---> dotdotcurr.rex (Returns "dotdotcurr")
>            |       |
>            |       +---> *curr* (The current directory)
>            |               |
>            |               +---> curr.rex (The program in the current
>            |               |               directory. Returns "curr")
>            |               +---> oorexxextensions (Extensionless. Returns
>            |               |               "directory")
>            |               +---> reginaextensions.rex (Returns
>            |               |               "directory")
>            |               +---> *lib*
>            |                       |
>            |                       +---> currlib.rex (Returns "currlib")
>            |
>            +---> *dotdotpath*
>                    |
>                    +---> dotdotpath.rex (Returns "dotdotpath")
>                    |
>                    +---> *path*
>                            |
>                            +---> path.rex (The program in the path
>                            |               directory. Returns "path")
>                            +---> oorexxextensions.rex (Returns
>                            |               "extension")
>                            +---> reginaextensions.rexx (Returns
>                            |               "extension")
>                            +---> *lib*
>                                    |
>                                    +---> pathlib.rex (Returns "pathlib")
>
> The test initiator, sotest.rex, immediately calls main.rex, located in the
> subdir/dotdotsame/same subdirectory, which then proceeds to test all kinds
> of CALL statements, starting with "normal" calls and continuing with more
> "pathological" ones. This allows us to know and tabulate the behaviour of
> the different interpreters.
>
> You can have a look at some test results here:
>
>    - Object Rexx for OS/2
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/os2.objrexx.results.txt>.
>    - Regina Rexx for OS/2
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/os2.regina.results.txt>.
>    - Classic Rexx for OS/2 (REXXSAA)
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/os2.rexxsaa.results.txt>.
>    - ooRexx for Ubuntu
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/ubuntu.oorexx.results.txt>.
>    - Regina for Ubuntu
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/ubuntu.regina.results.txt>.
>    - ooRexx for Windows, with the hasDirectory bug
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/windows-bug.oorexx.results.txt>.
>    - ooRexx for Windows, without the hasDirectory bug
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/windows-nobug.oorexx.results.txt>.
>    - Regina for Windows
>    <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/windows.regina.results.txt>.
>
> The results of each run of the test program are themselves a Rexx routine
> that returns a stem; this facilitates enormously the comparison of results,
> their aggregation and tabulation, etc. You can browse a preliminary
> interpretation of the results here
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/OS2(REXXSAA,OBJREXX,Regina),Windows(ooRexx,Regina),Ubuntu(ooRexx,Regina).md>;
> I've copied the main results below, and I'll comment briefly on them.
>
> *The concept of the "same" directory*
>
> The concept of the "same" (or caller's) directory is *exclusive to ooRexx*.
> Neither REXXSAA nor OBJREXX for OS/2 has such a concept, and Regina doesn't,
> either.
>
> On the other hand, it's a concept which is very frequent when dealing with
> compilers. The Microsoft C/C++ compiler, for example, searches, by
> default and in the first place, in the same directory (source
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/documents/include-directive-in-visual-studio.md>
> ).
>
> *Comment*. *There is nothing "natural" or "Rexx-like" in the idea of a
> "same" directory. Most of the predecessors of ooRexx don't even have such
> an idea. My personal impression is that the concept of a "same" directory
> is a very welcome addition and improvement to the external search order
> algorithms in Rexx, but that the exact definition and behaviour of this
> "same" directory is more nuanced and subtle than one could think at first,
> and that it should be better delimited and defined. That's, after all, what
> we are trying here.*
>
> *See also*: the above discussion about the same and the current directory.
>
> *Note: a bug in the OS/2 REXXSAA interpreter*
>
> The definition of the external search order for the OS/2
> REXXSAA interpreter states that "REXX functions in the current directory,
> with the current extension" will be searched, but this doesn't seem to
> work
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/documents/external-search-order-in-rexxsaa-for-os2.md>.
> We will refer to this behaviour of the REXXSAA interpreter as *the SAA
> bug*.
>
> *The similarity of all classic Rexx interpreters*
>
> Interestingly, all the classic Rexx, or non-object oriented, interpreters
> behave identically, regardless of the operating system (i.e., OS/2, Windows
> or Linux) and the interpreter variant (REXXSAA, Regina), and modulo the SAA
> bug. Regina is explicit about its directory exception algorithm, as we have
> seen above: only the current directory is searched when the file
> specification contains a path delimiter. REXXSAA gives the same results as
> Regina, so that the exception algorithm must be similar. This allows us to
> group, in the results, all the non-object oriented interpreters in a single
> column, for comparison.
>
> *"Normal" calls*
>
> In the first tests, we try calling routines located in the same, current
> and path directories (0 means "fail", and 1 means "pass"). REXXSAA,
> Object Rexx and Regina do not have the notion of the "same" directory; all
> the other CALL tests pass.
>
> +-----------------+-----+-----+-----+
> | Call            | CLA | OBJ | OOR |
> +-----------------+-----+-----+-----+
> | same.rex        |  0  |  0  |  1  |
> | curr.rex        |  1  |  1  |  1  |
> | path.rex        |  1  |  1  |  1  |
> +-----------------+-----+-----+-----+
>
> CLA: CLAssic Rexx interpreters, i.e., REXXSAA and Regina
> OBJ: IBM OBJect Rexx for OS/2
> OOR: OORexx
>
>
> *Downward-relative calls*
>
> Regina can't handle PATH-based downward-relative calls, because of its
> directory exception algorithm, and REXXSAA behaves similarly.
>
> +-----------------+-----+-----+-----+
> | Call            | CLA | OBJ | OOR |
> +-----------------+-----+-----+-----+
> | lib/samelib.rex |  0  |  0  |  1  |
> | lib/currlib.rex |  1  |  1  |  1  |
> | lib/pathlib.rex |  0  |  1  |  1  |
> +-----------------+-----+-----+-----+
>
>
>
> *Dot-relative calls*
>
> +-----------------+-----+-----+-----+
> | Call            | CLA | OBJ | OOR |
> +-----------------+-----+-----+-----+
> | ./samelib.rex   |  0  |  0  |  0  |
> | ./currlib.rex   |  1  |  1  |  1  |
> | ./pathlib.rex   |  0  |  1  |  0  |
> +-----------------+-----+-----+-----+
>
>
> Dot-relative call tests produce a matrix which is almost identical to the
> downward-relative calls; modulo the SAA bug
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/OS2(REXXSAA%2COBJREXX%2CRegina)%2CWindows(ooRexx%2CRegina)%2CUbuntu(ooRexx%2CRegina).md#the-saa-bug>,
> the "CLA" group (i.e., REXXSAA and Regina) exhibit the same behaviour (they
> only search in the current directory). The only difference in the test
> result matrix appears when the ooRexx interpreter is being tested: the ooRexx
> directory exception algorithm is triggered by "./file", so that neither the
> same directory nor the PATH is searched. It's interesting to note that
> OBJREXX *does* search in the PATH in this case.
>
> *Upward-relative calls*
>
> +-----------------+-----+-----+-----+
> | Call            | CLA | OBJ | OOR |
> +-----------------+-----+-----+-----+
> | ../samelib.rex  |  0  |  0  |  0  |
> | ../currlib.rex  |  1  |  1  |  1  |
> | ../pathlib.rex  |  0  |  1  |  0  |
> +-----------------+-----+-----+-----+
>
>
> Upward-relative call tests produce a matrix which is almost identical to the
> downward-relative calls, and completely identical to the dot-relative
> calls; modulo the SAA bug
> <https://github.com/RexxLA/rexx-repository/blob/master/ARB/standards/work-in-progress/search-order/tests/results/OS2(REXXSAA%2COBJREXX%2CRegina)%2CWindows(ooRexx%2CRegina)%2CUbuntu(ooRexx%2CRegina).md#the-saa-bug>,
> the "CLA" group (i.e., REXXSAA and Regina) exhibits the same behaviour (they
> only search in the current directory). The only difference in the test
> result matrix appears when the ooRexx interpreter is being tested: the ooRexx
> directory exception algorithm is triggered by "../file", so that neither the
> same directory nor the PATH is searched. It's interesting to note that
> OBJREXX *does* search in the PATH in this case.
>
> *Upward-relative calls, with a trick*
>
> +------------------------+-----+-----+-----+
> | Call                   | CLA | OBJ | OOR |
> +------------------------+-----+-----+-----+
> | lib/../../samelib.rex  |  0  |  0  |  1  |
> | lib/../../currlib.rex  |  1  |  1  |  1  |
> | lib/../../pathlib.rex  |  0  |  1  |  1  |
> +------------------------+-----+-----+-----+
>
>
> The trick (go downwards first and then upwards twice) helps ooRexx to pass
> the tests, because ooRexx triggers the directory exception algorithm by
> inspecting *the first characters* of the filename only, but does not help
> with the "CLA" group (i.e., REXXSAA and Regina), because they search for a
> path separator *in the whole filename* (confirmed for Regina and true for
> REXXSAA according to reverse engineering).
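
The two ways of triggering the directory exception can be compared side by
side with a short Python sketch. It uses the Unix forms of the tests quoted
earlier in this message; the function names are assumptions of the sketch,
and REXXSAA is assumed to behave like the Regina test, as stated above.

```python
def oorexx_directory_exception(filename):
    """Unix form of the ooRexx test: looks at the first characters only."""
    return filename.startswith(("~", "/", "./", "../"))


def regina_directory_exception(filename, separator="/"):
    """Regina's test: a separator anywhere in the name."""
    return separator in filename


for name in ("pathlib.rex", "../pathlib.rex", "lib/../../pathlib.rex"):
    print(name, oorexx_directory_exception(name), regina_directory_exception(name))
```

For "lib/../../pathlib.rex" the prefix test does not fire, so ooRexx goes
on to search its whole super-path, whereas the contains test does fire and
limits the search to the current directory.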
>
> *Other tests*
>
> The other tests refer to more obscure, Windows- and OS/2-only variants of
> CALL, and will not be discussed here.
>
> Message from Rony G. Flatscher <[email protected]> on Wed., 8
> Feb. 2023 at 14:08:
>
>> Dear Josep Maria,
>>
>> again thank you very much for your thorough write up!
>>
>
> I'll finish my reply by interspersing some random comments.
>
>> Having had a little bit more time and researching the Internet for
>> resources that explain/document in a brief, but professional manner the
>> terms and definitions that get used in this thread as this makes it better
>> researchable for others (e.g. I know that Windows by default will consult
>> the current directory when searching for executables, whereas in Unix
>> *usually* this is not the case such that one must add the current
>> directory symbolized as the dot '.' to the PATH).
>>
>> Here two resources which may help for this discussion:
>>
>>    - PATH (variable): <https://en.wikipedia.org/wiki/PATH_(variable)>
>>    - Path (computing): <https://en.wikipedia.org/wiki/Path_(computing)>
>>
> When I see the same word used for two different, but related, concepts,
> well... :)
>
>
>> Ad resolving Rexx programs, here a few terms that get used further down:
>>
>>    1. srcDir: the source directory of the Rexx program that currently
>>    gets run (one can get at it by extracting the path2pgm's location from
>>    'parse source . . path2pgm'
>>    2. currDir: the current working directory in which the Rexx program
>>    executes
>>
> The current directory can be changed while a program executes, and
> therefore there is no single directory in which the program executes.
>
>
>>
>>    3. pathDir: the directories listed on the PATH environment variable
>>
>>    4. relativePath: any path that does not start with the root
>>    directory, which therefore gets resolved relative to currDir
>>
> It's a little more complicated than that. "D:path\file.ext", for
> example, is relative, but relative to the current directory *of the D:
> drive*. In Windows, each drive has its own current directory. Furthermore,
> and in general terms, thinking that a relative path has to be resolved
> relative to the current directory amounts to begging the question. I.e., it
> *assumes* that we are adopting a perspective identical to the current
> behaviour of the ooRexx interpreter, instead of adopting a more general
> perspective, to be able to carry on our investigation, and determine later
> whether the idea of a current directory is "natural", or "convenient", or,
> more generally, which place we will be assigning to this concept. As we have
> previously seen, there are search order contexts where the idea of a current
> directory does not apply, contexts where the "natural" meaning of a relative
> directory specification is the "same", or some other directory, and so on.
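
The three kinds of Windows file specification mentioned here can be told
apart with Python's standard ntpath module, which applies Windows path rules
on any platform. This is only an illustrative sketch; the function name
classify and its three labels are assumptions of the example.

```python
import ntpath  # Windows path rules, importable on any platform


def classify(spec):
    """Label a Windows file specification by how it would be resolved."""
    drive, _ = ntpath.splitdrive(spec)
    if ntpath.isabs(spec):
        return "absolute"
    if drive:
        # Resolved against the current directory of that drive.
        return "drive-relative"
    return "relative"


for spec in (r"D:\path\file.ext", r"D:path\file.ext", r"path\file.ext"):
    print(f"{spec:<18} {classify(spec)}")
```

Only the last form leaves open the question of which directory it should be
resolved against; the drive-relative form already names a current directory
of its own.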
>
>
>>
>>    5. absolutePath: any path that starts out with the root directory,
>>    which therefore locates exactly the desired executable
>>
> An obvious point: there is more than one root under Windows and OS/2.
>
>
>>
>>    6. unqualifiedExecutable: the name of an executable without path
>>    information
>>    7. relativeExecutables: the name of an executable with relative path
>>    information, i.e. relativePath
>>    8. absoluteExecutables: the fully qualified name of any executable,
>>    i.e. absolutePath
>>
>> Searching for executables via the operating system:
>>
>>    - Unix-like:
>>       - unqualifiedExecutables: get searched along the pathDir (PATH) in
>>       the order supplied, if not found an error gets raised
>>       - relativeExecutables: the supplied information gets appended to
>>       currDir (current working directory) and denotes the exact location of
>>       the executable, no further searches are undertaken and if not found an
>>       error gets raised
>>       - absoluteExecutables: denotes the exact location of the
>>       executable, no further searches are undertaken and if not found an
>>       error gets raised
>>
>>    - Windows:
>>       - unqualifiedExecutables:
>>          - first the current working directory gets searched for it and if
>>          not found
>>          - search along the pathDir (PATH) in the order supplied, if not
>>          found an error gets raised
>>       - relativeExecutables: the supplied information gets appended
>>       to currDir (current working directory) and denotes the exact location
>>       of the executable, no further searches are undertaken and if not found
>>       an error gets raised
>>
> It's considerably more complicated. Firstly, you have the case of
> drive-relative filenames, i.e. D:path/file.ext, where you can have several
> current directories, one for each drive. Secondly, a directory specified in
> the path can itself be relative. The clearest example of that is ".". You
> can add "." to the path under Unix, and get the same effect as under
> Windows, where the presence of "." in the path is implicitly assumed. Or
> you can add ".." to the path. In general terms, you can add a relative path
> to the PATH variable: then this path first gets prepended to the filename,
> and the result is in turn resolved against the current directory
> (double-relative resolution).
>
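The double-relative resolution described above can be sketched in a few lines. This is only an illustration of the idea, not code from any interpreter: the function name resolve_via_path_entry and the example directories are made up, and Python's ntpath module stands in for Windows-style path handling.

```python
import ntpath  # Windows-style path rules, usable on any platform


def resolve_via_path_entry(cwd, path_entry, filename):
    """Hypothetical helper: resolve filename through one PATH entry.

    The PATH entry is prepended to the filename, and the result is then
    resolved against the current directory. ntpath.join drops earlier
    components when a later one is absolute, so an absolute PATH entry
    simply overrides cwd.
    """
    return ntpath.normpath(ntpath.join(cwd, path_entry, filename))


# A relative PATH entry ("..") combined with a relative filename.
print(resolve_via_path_entry(r"C:\work\proj", "..", r"tools\prog.rex"))
# An absolute PATH entry combined with a "..\" filename.
print(resolve_via_path_entry(r"C:\work\proj", r"C:\a\b", r"..\program.rex"))
```

Drive-relative names such as D:path/file.ext are deliberately left out of this sketch, since they need one current directory per drive.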
>>
>>    -
>>       - absoluteExecutables: denotes the exact location of the
>>       executable, no further searches are undertaken and if not found an
>>       error gets raised
>>
>> So the resolution of executables is the same on Unix and Windows except
>> for Windows first searching currDir (the current working directory) in the
>> case of unqualifiedExecutables.
>>
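As a reading aid, the classification quoted above can be written down as a small candidate generator. It is a sketch of the quoted rules only, under simplifying assumptions: the name os_candidates is hypothetical, extensions are ignored, and drive-relative names and relative PATH entries are not modelled.

```python
import ntpath
import posixpath


def os_candidates(name, cwd, path_dirs, windows=False):
    """Hypothetical model of the quoted rules: list the locations tried."""
    p = ntpath if windows else posixpath
    separators = ("\\", "/") if windows else ("/",)
    if p.isabs(name):
        # absoluteExecutable: exactly one location, no search
        return [p.normpath(name)]
    if any(sep in name for sep in separators):
        # relativeExecutable: resolved against currDir only
        return [p.normpath(p.join(cwd, name))]
    # unqualifiedExecutable: PATH order, with currDir first on Windows
    dirs = ([cwd] if windows else []) + list(path_dirs)
    return [p.join(d, name) for d in dirs]


print(os_candidates("prog", "/home/u", ["/usr/bin", "/bin"]))
print(os_candidates("prog.exe", r"C:\work", [r"C:\bin"], windows=True))
```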
> The Windows world is a real nightmare. You have to add (1) several roots,
> which include (1.a) different drives (A:, B:, C:, etc.), and (1.b) UNC
> names, like \\server\share\path\name.ext. Each drive has a current
> directory. Then you have very strange concoctions, like filenames starting
> with "\\?\" (source
> <https://learn.microsoft.com/en-us/windows/win32/fileio/naming-a-file#win32-file-namespaces>),
> sometimes used to overcome the MAX_PATH limitations. Then there is still
> another prefix, "\\.\", to refer to Win32 devices. And so on. As I said, a
> real nightmare.
>
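To make the variety of Windows name forms concrete, here is a rough classifier for the kinds listed above. The function name and the category labels are my own, chosen for illustration; the checks are purely textual and do not validate the names.

```python
def classify_windows_path(path):
    """Hypothetical helper: label a Windows path by its leading syntax."""
    if path.startswith("\\\\?\\"):
        return "extended-length"   # the "\\?\" prefix
    if path.startswith("\\\\.\\"):
        return "device"            # the "\\.\" prefix (Win32 devices)
    if path.startswith("\\\\"):
        return "unc"               # \\server\share\...
    if len(path) >= 2 and path[0].isalpha() and path[1] == ":":
        if len(path) >= 3 and path[2] in "\\/":
            return "drive-absolute"    # C:\dir\file
        return "drive-relative"        # D:path\file, uses D:'s current dir
    if path[:1] in ("\\", "/"):
        return "rooted"            # \dir\file, relative to the current drive
    return "relative"


for sample in [r"\\server\share\path\name.ext", r"D:path\file.ext",
               r"C:\dir\file.ext", r"\\.\COM1", r"..\program.rex"]:
    print(sample, "->", classify_windows_path(sample))
```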
>
>> Conclusion #1: if relativeExecutables get searched and are not found
>> relative to the current working directory, then no further search takes
>> place and an error gets raised! This is how the operating systems behave,
>> no matter whether using a shell/terminal or system services.
>>
> Again, it's much more complicated. Under Windows, you have to take into
> account the value of the PATHEXT environment variable *to locate an
> executable*. To locate a DLL, other environment variables are used. Java
> programs use CLASSPATH, not PATH. ooRexx programs use REXX_PATH, in
> addition to PATH. Regina programs use REGINA_MACROS, in addition to PATH.
> ooRexx, when called programmatically, can specify an EXTERNAL_CALL_PATH
> that supplies an additional path. Similarly, there is
> an EXTERNAL_CALL_EXTENSIONS parameter, to specify additional extensions. And
> so on and on :)
>
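The effect of an extension list such as PATHEXT on the search can be sketched as follows, contrasting the directory-first and extension-first orders mentioned earlier in this message. The function names and the sample directories are assumptions for illustration; no real interpreter or shell code is reproduced here.

```python
import ntpath


def _extensions(name, pathext):
    # A name that already has an extension is searched as given;
    # otherwise each PATHEXT-style extension is tried in order.
    if ntpath.splitext(name)[1]:
        return [""]
    return [e for e in pathext.split(";") if e]


def directory_first(name, dirs, pathext):
    """All extensions are tried in one directory before moving to the next."""
    return [ntpath.join(d, name + e) for d in dirs
            for e in _extensions(name, pathext)]


def extension_first(name, dirs, pathext):
    """Each extension is tried in every directory before the next extension."""
    return [ntpath.join(d, name + e) for e in _extensions(name, pathext)
            for d in dirs]


dirs = [r"C:\a", r"C:\b"]
print(directory_first("prog", dirs, ".COM;.EXE"))
print(extension_first("prog", dirs, ".COM;.EXE"))
```

The two orders visit the same candidates; they differ only in which file wins when several matches exist.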
>
>>
>> Conclusion #2: if non-operating system software behaves differently than
>> the operating system then this is the responsibility of that software and
>> needs to be documented. E.g. the observation that a compiler like gcc will
>> use the paths in some environment variables in a different manner (e.g.
>> using the value of the INCLUDE environment variable for locating c/cpp
>> include files), does not define/determine/change how PATH should get used
>> for locating unqualifiedExecutables.
>>
>>
>> Conclusion #3: if a Rexx CALL that causes an external search, or an
>> ooRexx ::requires directive (which, the first time it is encountered,
>> causes a CALL of the denoted external file), gets executed, then the
>> following rules (should) apply:
>>
>>    - Rexx programs that are unqualifiedExecutables:
>>       - srcDir gets searched first and if not found
>>       - the operating system search for unqualifiedExecutables gets
>>       carried out next and if not found an error gets raised
>>    - Rexx programs that are relativeExecutables: the supplied
>>    information gets appended to currDir (current working directory) and
>>    denotes the exact location of the Rexx program, no further searches are
>>    undertaken and if not found an error gets raised
>>    - Rexx programs that are absoluteExecutables: denotes the exact
>>    location of the Rexx program, no further searches are undertaken and if
>>    not found an error gets raised
>>
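The rules proposed in Conclusion #3 can be paraphrased as a candidate list, which makes it easier to compare them with other search orders. This is a sketch of the proposal as quoted, restricted to the Unix-like case; the function name rexx_call_candidates and the sample paths are hypothetical, and default extensions are ignored.

```python
import posixpath


def rexx_call_candidates(name, src_dir, cwd, path_dirs):
    """Hypothetical model of the proposed rules (Unix-like case only)."""
    if posixpath.isabs(name):
        # absoluteExecutable: exact location, no search
        return [posixpath.normpath(name)]
    if "/" in name:
        # relativeExecutable: resolved against currDir, no further search
        return [posixpath.normpath(posixpath.join(cwd, name))]
    # unqualifiedExecutable: srcDir first, then the PATH directories
    return [posixpath.join(d, name) for d in [src_dir, *path_dirs]]


print(rexx_call_candidates("util.rex", "/opt/app", "/home/u", ["/usr/bin"]))
print(rexx_call_candidates("../util.rex", "/opt/app", "/home/u", ["/usr/bin"]))
```

Note that under these rules srcDir plays no role for a name such as "../util.rex"; only currDir is consulted.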
>> Given these findings and conclusions it is probably a misconception to
>> expect Rexx/ooRexx to behave like gcc (when employing the INCLUDE
>> environment variable directories), rather than the way operating systems
>> resolve PATH.
>>
> That is the point where I get the impression that you're begging the
> question <https://en.wikipedia.org/wiki/Begging_the_question>. I've
> argued above in some detail why I think that we can't adhere too quickly to
> the command-line or executable paradigm. The comparison with gcc is not
> to say that ooRexx should work as gcc does, but to point out that it's
> not unthinkable that ooRexx could work as I described.
>
> After all, Object Rexx for OS/2 does work as I mentioned. It searches
> "..\program.rex" against the current directory and against the PATH. Hey,
> it even searches against the same directory, if one puts it first in the
> path.
>
> Why and how has ooRexx, which after all is an evolution of Object Rexx,
> *lost* this capability? After all, losing a capability is a sad thing;
> it's better and easier not to use something that is offered to you than not
> being able to use it because there is a limitation.
>
> I can't answer this question, but we can always guess. The Unix version
> of the interpreter has a routine, SysFileSystem::canonicalizeName, in
> whose description we read "*Process a file name to add the current
> working directory or the home directory, as needed, then remove all of the
> . and .. elements*". There's no corresponding routine for Windows: all
> the work is left to the SearchPath Windows API. But this API chokes on the
> "..\" and ".\" cases: see Erich's comment on bug ticket no. 1865
> <https://sourceforge.net/p/oorexx/bugs/1865/>: "*this is a restriction of
> the SearchPath Windows API. **It doesn't support searching for a filename
> with a leading .\ or ..\*".
>
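For readers who have not looked at the interpreter sources, the behaviour described in the quoted comment can be approximated as below. This is not the ooRexx C++ routine, only a Python sketch of what its description says; the function name canonicalize_name and its explicit cwd and home parameters are assumptions made to keep the example self-contained.

```python
import posixpath


def canonicalize_name(name, cwd, home):
    """Sketch: add cwd or home as needed, then remove "." and ".." elements."""
    if name == "~" or name.startswith("~/"):
        # Home-relative name: substitute the home directory for "~".
        name = home + name[1:]
    elif not name.startswith("/"):
        # Relative name: resolve against the current working directory.
        name = posixpath.join(cwd, name)
    # normpath collapses "." and ".." elements textually.
    return posixpath.normpath(name)


print(canonicalize_name("../program.rex", "/home/u/proj", "/home/u"))
print(canonicalize_name("~/bin/./tool.rex", "/tmp", "/home/u"))
```

A Windows counterpart would also have to deal with drive letters and per-drive current directories, which this sketch does not attempt.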
> *My personal opinion* is that we should not be using this API. Not at
> all, since it doesn't work as expected. As expected by Erich, as expected
> by you yourself. As expected by me. As expected by anybody who reads the
> documentation in detail. As expected by any user of IBM Object Rexx for
> OS/2, which does not have this limitation. Hey, half of the work is already
> done, one only has to clone and adapt SysFileSystem::canonicalizeName from
> the Unix-like world.
>
> But this is only my opinion. I hope that the rest of the information that
> I have presented here helps to foster the debate.
>
> Best regards,
>
>   Josep Maria
>
>
>
>>
>> In the case of Rexx/ooRexx the documentation defines for
>> unqualifiedExecutables to first search srcDir and then pathDir. (Probably
>> it needs to be improved w.r.t. the above as currently one can observe
>> quite some confusion even among long-time users while discussing this
>> issue.)
>>
>> Would that be applicable for your case as well?
>> Best regards
>>
>> ---rony
>>
>

_______________________________________________
Oorexx-devel mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/oorexx-devel
