2017-03-18 13:16:56 -0400, Chet Ramey:
> On 3/17/17 5:51 PM, Stephane Chazelas wrote:
>
> > Now, if that "split" functions is called from within a function
> > that declares $IFS local like:
> [...]
> > because after the "unset IFS", $IFS is not unset (which would
> > result in the default splitting behaviour) but set to ":" as it
> > was before "bar" ran "local IFS=."
[...]

For bash, it looks like the boat has sailed as the issue has
been discussed before, but let me at least offer my opinion, and
also add the maintainers of yash and mksh in Cc so they can
comment, as they have similar issues in their shells which they
may want to address (at least in the documentation). It's even
worse for mksh and yash as it's harder to work around there.

For Yuki and Thorsten, see the start of the discussion at
https://www.mail-archive.com/bug-bash@gnu.org/msg19431.html
(and
https://www.mail-archive.com/miros-mksh@mirbsd.org/msg00697.html
before that)

In short, the issue is that "unset var" does not always leave
$var unset (contrary to what the documentation or the name of
the command suggests) but may instead restore a previous value.

Reproducer for bash/pdksh/yash:

  $ f()(unset a; echo "$a"); g() { typeset a=2; f; }; a=1; g
  1

For bash, also:

  $ f()(unset a; echo "$a"); a=1; a=2 f
  1

One workaround for bash is:

  f()(local a; unset a; echo "$a"); g() { typeset a=2; f; }; a=1; g

as long as we want "$a" to be unset only for the local function
(already enforced by the subshell in this case), or with
bash/mksh/yash:

  f()(while [ "${a+set}" ]; do unset a; done
      echo "$a"); g() { typeset a=2; f; }; a=1; g

(again, not likely to do what we want when not called in a
subshell)

(see also

  f()(unset "a[0]"; echo "$a"); g() { typeset a=2; f; }; a=1; g

in pdksh, which doesn't unset "$a" but makes it an array with no
element).

>
> This is how local variables work. Setting a local variable shadows
> any global with the same name; unsetting a local variable reveals the
> global variable (or really, because bash has dynamic scoping, the value
> at a previous scope) since there is no longer a local variable to shadow
> it.
[...]

Chet, the behaviour you describe above would be that of a
"popvar" (not "unset") command, an arcane command for an arcane
feature: pop a variable off a stack to restore the value (and
attributes) it had in an outer scope.
A feature I would probably never need in a million years. The
only known use of it is that hack
(http://www.fvue.nl/wiki/Bash:_Passing_variables_by_reference)
to be able to return a value into a variable passed as argument
to a function while still being able to use a local variable
with the same name in the function.

There is no way any sane person would write "unset IFS" and mean
anything other than unsetting the IFS variable (making sure $IFS
is not set afterwards so word splitting reverts to the default).

There's no way any sane person would expect that to mean
"restore the variable from an outer scope I don't know about"
(yash/pdksh) or, in the case of bash, "restore the variable from
an outer scope unless I've declared it in the current function
context".

Unsetting variables is an essential feature in shells, as many
variables, especially those in the environment, have a special
meaning or affect behaviour when set.

I can't imagine it being anything other than an unintended
accident of implementation, certainly not an intentional feature
of the language (at least not initially).

In all other languages that have a similar
"unset"/"undef"/"delete" feature (tcl, perl, php, ksh88 (dynamic
scoping), ksh93 (static scoping), zsh, dash at least), unset
unsets the variable in the innermost scope it has been declared
in.

I don't know of any language that has a "popvar" feature to
allow the user to unravel the variable stack behind the back of
the interpreter.

Several languages with static scoping (tcl with upvar, ksh93
with "typeset -n", python3 with nonlocal at least) have a way to
access variables in a parent scope, but with dynamic scoping,
there's no need for that. Child functions already have access to
the parent variables.

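
To make the last point concrete, here is a minimal sketch of
dynamic scoping in bash. The function and variable names (inner,
outer, result) are made up for illustration and do not come from
the discussion above. inner assigns to the variable that outer
declared local, without any reference mechanism:

```shell
# Dynamic scoping sketch for bash. All names here are hypothetical.

inner() {
  # No "local result" in this function, so the assignment goes to the
  # nearest "result" up the call stack: the local declared in outer.
  result="set by inner"
}

outer() {
  local result="set by outer"
  inner
  printf '%s\n' "$result"
}

result="global"
outer                    # prints the value assigned by inner
printf '%s\n' "$result"  # the global is untouched
```

Running it prints "set by inner" and then "global": the
assignment in inner lands in outer's local variable and the
global keeps its value.
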
The issue (that there's no notion of variable reference in those
shells) that
http://www.fvue.nl/wiki/Bash:_Passing_variables_by_reference
tries to hack around is better addressed IMO with namespacing:
for instance, return the value in a dedicated variable (REPLY is
already used for that internally in bash and several other
shells), or make sure utility functions that modify arbitrary
variables use a dedicated prefix for their own variables.

In any case, even if that was an essential feature, it would not
be a good reason for breaking the "unset" command or at least
subverting its meaning. Implementing "typeset -n" like in ksh93
or an "upvar" builtin a la tcl would make a lot more sense IMO.

On comp.unix.shell or http://unix.stackexchange.com, I've posted
many articles describing how to do splitting in POSIX-like
shells:

  ( # subshell for local scope
    unset -v IFS  # restore default splitting behaviour
    set -o noglob # disable globbing
    cmd -- $var   # split+glob with default IFS and glob disabled
  )

I'm now considering adding a note along the lines of:

  "Beware that with current versions of bash, pdksh and yash,
  the above may not work if used in scripts that otherwise use
  typeset/declare/local on $IFS or call a function with
  `IFS=... my-function' (or IFS=... eval... or IFS=...
  source...)"

-- 
Stephane