(Not Texinfo related: please ignore if you're not interested.)

Hello Gavin,

In reply to:
http://lists.gnu.org/archive/html/bug-texinfo/2015-09/msg00115.html

It has been a long time since I should have answered about what evolution I was thinking of concerning interacting with external commands...

Well, there are two points about hooking how the shell interacts with external commands:

- environment
- command arguments

On the second point I think that, at least in bash, there is already some provision for making such user hooks. Imagine you have some command foo.exe, and you want a hook to prefix the 1st argument with a + sign before calling the command; you can already write in your .bash_profile (not tested):

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
function foo(){
    # keep the first argument aside, then drop it from "$@"
    local arg1=$1
    shift
    # `command' bypasses the function lookup, so foo.exe itself is run
    command foo.exe "+$arg1" "$@"
}
export -f foo
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

Well, `command' is a bash builtin (cf. info node "(bash) Bash Builtins"), I don't know whether sh has the same thing...

However, even though such provision exists, it is not sufficient to make generic user hooks that:

- would be called instead of a command, whether this command is an executable, a builtin, or a script
- would be called based on some condition matching the command name

If you need to hook all the commands whose name starts with f and does not end with d, there could be the following sort of syntax for the hook definition:

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
hook hookname()
    when [[ $0 == f* && $0 != *d ]];
    at 0;
{
    local arg1
    arg1=$1
    shift
    "$0" "+$arg1" "$@"
}
export -h hookname
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

Where this syntax

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
when [[ $0 == f* && $0 != *d ]];
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

is the condition under which the hook named `hookname' is executed instead of the command.

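For comparison, the effect of such a condition can be approximated in current bash, but only if every call goes through an explicit wrapper, which is exactly what a real hook would avoid. A minimal sketch, where `run_hooked', `foo' and `food' are hypothetical names made up for the illustration:

```bash
#!/usr/bin/env bash
# Sketch only: emulates the proposed `when' condition with a wrapper that
# has to be called explicitly. All names here are hypothetical.

# Run a command through the "hook": when the command name starts with f
# and does not end with d, prefix its first argument with a + sign;
# otherwise run the command unchanged.
run_hooked() {
    local cmd=$1
    shift
    if [[ $cmd == f* && $cmd != *d ]]; then
        "$cmd" "+$1" "${@:2}"
    else
        "$cmd" "$@"
    fi
}

# Two stand-in commands to try the wrapper on: one matches the
# condition, the other ends with d and so does not.
foo()  { printf 'foo:%s\n' "$*"; }
food() { printf 'food:%s\n' "$*"; }
```

With these definitions, `run_hooked foo a b' ends up calling foo with the arguments +a and b, whereas `run_hooked food a b' leaves the arguments alone.
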
There is also a specification for order

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
at 0
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

meaning that hookname is the first hook that is tried for the condition (there could be other types of specification, like `after somehook' or `before somehook', or a negative index, like `at -1' for creation in last position, `at -2' for one position before last, etc.). The `export -h' would tell bash to export this hook to child processes (if another bash script is subshelled).

Command hooks would be examined immediately after the shell has prepared the command line (path to command and arguments). That may be before the fork used for creating the child process in which the command is run.

Now this was the easy part of it. Concerning the first point, the environment, i.e. the environment translation (and just as a reminder, this is where the texi2dvi script has some limitations when running over MSYS), this is more thorny. You noted in your latest email:

> I understand when a shell launches a process, it forks (creating a
> copy of the shell process), sets up the environment for the process
> (for example environment variables and file descriptors), and then
> uses the "exec" call to replace itself with the program being
> launched. What would be interesting would be if there was a way to
> intervene after the fork, but before the exec.

That was not really the idea I had in mind. What you were considering is some way to translate the environment from MSYS format to the MSW native format when native commands are invoked. Instead, my idea was that the environment would be « unchanged » by MSYS, *BUT* MSYS bash scripts and MSYS applications would access it through translator objects.

Let us consider some fancy silly example for the sake of explanation.

Imagine some envvar FOO, the value of which is "bar" in native format, but in MSYS format I need to suffix a "t" to all values, so the value in MSYS format has to be "bart". Syntactically I would add to my MSYS .profile or .bash_profile the following statements:

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
# Here we declare a class to translate access to variable FOO
envtranslation FOO_TRANSLATE {
    # declare member variables. -m is a new option for declare.
    declare -m native # previous value of $this
    declare -m cache  # cached translation

    # read the envvar (accessed in $this special member variable of this
    # object) with translating it ("t" appending). A cache technique is
    # used to make the translation only when $this has a new value
    function get() {
        if [ "$native" != "$this" ]; then
            # do the translation
            native=$this
            cache=${native}t
        fi
        # $cache is the got value, `got' is a novel keyword
        got "$cache"
    }

    # set the envvar to a new value. $1 is the new MSYS value.
    function set() {
        # here we need to remove the trailing "t"
        this=${1:0:-1}
        native=$this
        cache=$1
    }
}

# now tell MSYS that from now on FOO has to be converted via the
# FOO_TRANSLATE class. -c is a new option for declare
declare -c FOO_TRANSLATE FOO
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

We also have a default translation class, which is the class of envvars that have not been declared with `declare -c ....', like this:

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
envtranslation default {
    function get() {
        got "$this"
    }
    function set() {
        this=$1
    }
}
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

The default class is useful if I want to stop translating some envvar.

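The behaviour intended for FOO_TRANSLATE can be imitated in current bash with plain functions, as long as the script calls them explicitly instead of reading or assigning FOO directly. This is only a sketch under assumptions of mine: `foo_get', `foo_set', `FOO_native' and `FOO_cache' are hypothetical names, the result of a read is handed back in REPLY so that the cache stays in the current shell, and ${1%?} is used to drop the last character.

```bash
#!/usr/bin/env bash
# Sketch only: plain functions standing in for the proposed
# FOO_TRANSLATE class. All helper names are hypothetical.

export FOO=bar   # value stored in the environment, in native format
FOO_native=      # native value seen at the last translation
FOO_cache=       # cached MSYS-format translation

# Read FOO in MSYS format. The translation (appending "t") is redone
# only when the native value differs from the one seen last time.
# The result is returned in REPLY.
foo_get() {
    if [ "$FOO_native" != "$FOO" ]; then
        FOO_native=$FOO
        FOO_cache=${FOO_native}t
    fi
    REPLY=$FOO_cache
}

# Assign FOO from an MSYS-format value: the last character is removed
# before the value is stored in the environment.
foo_set() {
    FOO=${1%?}
    FOO_native=$FOO
    FOO_cache=$1
}
```

After `foo_get' the variable REPLY holds "bart", while a child process still sees "bar" in its environment, because only the native value is ever stored in FOO.
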
So what will happen? Assume that the current native value of FOO is "bar"; we have three cases:

- access by the shell script
- access by a native command
- access by an MSYS command

access by the shell script
~~~~~~~~~~~~~~~~~~~~~~~~~~

In case FOO is read in the shell script itself, after the `declare -c FOO_TRANSLATE FOO' statement, for instance by this statement

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
BART=$FOO
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

then the `get' method of the FOO_TRANSLATE class is called by bash, and variable BART gets the value "bart" instead of "bar".

Similarly, the following statements in a script

--8<----8<----8<----8<----8<-- begin -->8---->8---->8---->8---->8----
FOO="guillemot"
native-command.exe
--8<----8<----8<----8<----8<-- end -->8---->8---->8---->8---->8----

will set "guillemo" into the envvar FOO: as the `set' method is called, the last char is removed. So the native command native-command.exe called next will get "guillemo" if it calls `getenv("FOO")'.

access by native commands
~~~~~~~~~~~~~~~~~~~~~~~~~

See the example above: native commands get the values in native format, because any assignment in the shell calls the `set' method, which does the translation from MSYS format to native format.

access by MSYS commands
~~~~~~~~~~~~~~~~~~~~~~~

The idea is that the following data is inherited by the subshell calling the command, similar to inheriting the environment:

- the environment translation class (ETC) definition byte code
- the mapping of ETCs to envvars
- for each envvar's ETC object, its member attributes: e.g. attributes `cache' and `native' in the case of FOO_TRANSLATE would have to be inherited for each envvar which, like FOO, has been declared with the FOO_TRANSLATE ETC.

Now, when an MSYS command is called, any invocation of getenv or setenv (i.e. in the subshell after the exec is called) will check the ETC mapping and, if the ETC is different from `default', will run the bytecode interpreter to execute `get' or `set' respectively.

The problem with the above idea is that you have a major backward compatibility issue, in that you have to recompile all the MSYS commands with the new getenv & setenv implementation.

In your idea (conversion done somewhere between fork and exec) that would not be the case, but when there is a large number of envvars you need to do all the translations every time, even for all the variables which the native command does not need.

Another issue with your method is that bash would anyway need to know whether a command is an MSYS command or not. I don't know how it works currently; I suspect that in the process of subshelling a command there is already some way to detect whether it is an MSYS command or not. Otherwise, your method would suffer the same backward compatibility issue.

VBR,
Vincent.
