2017-03-18 13:16:56 -0400, Chet Ramey:
> On 3/17/17 5:51 PM, Stephane Chazelas wrote:
> 
> > Now, if that "split" functions is called from within a function
> > that declares $IFS local like:
>       [...]
> > because after the "unset IFS", $IFS is not unset (which would
> > result in the default splitting behaviour) but set to ":" as it
> > was before "bar" ran "local IFS=."
[...]

For bash, it looks like the boat has sailed as the issue has
been discussed before, but let me at least offer my opinion, and
also add the maintainers of yash and mksh in Cc so they can
comment as they have similar issues in their shells which they
may want to address (at least in the documentation). It's even
worse for mksh and yash as it's harder to work around there.

  For Yuki and Thorsten, see the start of the discussion at 
  https://www.mail-archive.com/bug-bash@gnu.org/msg19431.html
  (and
  https://www.mail-archive.com/miros-mksh@mirbsd.org/msg00697.html
  before that)

  In short, the issue is that "unset var" does not always leave
  $var unset (contrary to what the documentation or the name of
  the command suggest) but may instead restore a previous value.
  Reproducer for bash/pdksh/yash:

  $ f()(unset a; echo "$a"); g() { typeset a=2; f; }; a=1; g
  1

  For bash, also:

  $ f()(unset a; echo "$a"); a=1; a=2 f
  1

  One workaround for bash is:

  f()(local a; unset a; echo "$a");  g() { typeset a=2; f; }; a=1; g

  as long as we want "$a" to be unset only for the local
  function (already enforced by the subshell in this case) or with
  bash/mksh/yash:

  f()(while [ "${a+set}" ]; do unset a; done
    echo "$a");  g() { typeset a=2; f; }; a=1; g

  (again, not likely to do what we want when not called in a
  subshell)

  (see also 
  f()(unset "a[0]"; echo "$a"); g() { typeset a=2; f; }; a=1; g
  which in pdksh doesn't unset "$a" but makes it an array with no
  elements).
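
To make the difference between the plain unset and the loop
workaround easier to observe, here is a minimal sketch. The names
probe, f_plain and f_loop are made up for illustration, and it uses
"local" rather than "typeset" so that it also runs in dash. The first
line of output depends on the shell: going by the reproducer above,
bash, pdksh and yash should report the outer value 1, whereas a shell
where unset really unsets should report the variable as unset. The
second line should report "a is unset" everywhere.

```shell
# probe reports whether $a is set, using ${a+set} so that an empty
# value is not mistaken for an unset variable.
probe() {
  if [ "${a+set}" ]; then
    echo "a is set to '$a'"
  else
    echo "a is unset"
  fi
}

# Plain unset, in a subshell so the caller's variables are untouched.
f_plain() (unset a; probe)

# Loop workaround: keep unsetting until no outer value shows through.
f_loop() (
  while [ "${a+set}" ]; do unset a; done
  probe
)

# g declares a local $a and then runs the function named in $1.
g() { local a=2; "$1"; }

a=1
g f_plain   # shell-dependent, see the paragraph above
g f_loop
```

As with the one-liners above, both variants only stay harmless because
they run in a subshell.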
> 
> This is how local variables work.  Setting a local variable shadows
> any global with the same name; unsetting a local variable reveals the
> global variable (or really, because bash has dynamic scoping, the value
> at a previous scope) since there is no longer a local variable to shadow
> it.
[...]

Chet, the behaviour you describe above would be that of a "popvar"
(not "unset") command, an arcane command for an arcane feature:
pop a variable off a stack to restore the value (and attributes)
it had in an outer scope. A feature I would probably never need
in a million years. The only known usage of it is that hack
(http://www.fvue.nl/wiki/Bash:_Passing_variables_by_reference)
to be able to return a value into a variable passed as argument
to a function while still being able to use a local variable
with the same name in the function.

There is no way any sane person would write

   unset IFS

and mean anything other than unsetting the IFS variable (making
sure $IFS is not set afterwards so word splitting reverts to the
default).

There's no way any sane person would expect that to mean
"restore the variable from an outer scope I don't know about"
(yash/pdksh) or in the case of bash: "restore the variable from
an outer scope unless I've declared it in the current function
context".

Unsetting variables is an essential feature in shells, as many
variables, especially those in the environment, have a special
meaning and affect the environment when set.

I can't imagine it being anything other than an unintended
accident of implementation, certainly not an intentional feature
of the language (at least not initially).

In all other languages that have an "unset"/"undef"/"delete"-like
feature (tcl, perl, php, ksh88 (dynamic scoping), ksh93
(static scoping), zsh, dash at least), unset unsets the variable
in the innermost scope it has been declared in. I don't know of any
language that has a "popvar" feature to allow the user to unravel
the variable stack behind the back of the interpreter.

Several languages with static scoping (tcl with upvar, ksh93
with "typeset -n", python3 with nonlocal at least) have a way to
access variables in a parent scope, but with dynamic scoping,
there's no need for that. Child functions already have access to
the parent variables.
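
As a small illustration of that last point, here is a sketch with
made-up function names (child, parent), using "local" so it runs in
both bash and dash. The child declares nothing and is passed nothing,
yet it updates the variable that its caller declared local.

```shell
# child has no declaration of its own: under dynamic scoping it sees
# and modifies the variable of whichever function called it.
child() { counter=$((counter + 1)); }

parent() {
  local counter=0   # local to parent, yet visible to child
  child
  child
  echo "parent: $counter"
}

counter=100
parent
echo "global: $counter"
```

This prints "parent: 2" followed by "global: 100": the two increments
landed on parent's local variable, and the global one was left alone.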

The issue (that there's no notion of variable reference in
those shells) that
http://www.fvue.nl/wiki/Bash:_Passing_variables_by_reference
tries to hack around is better addressed IMO with namespacing:
either return the value in a dedicated variable (REPLY, for
instance, is already used for that internally in bash and several
other shells), or make sure utility functions that modify
arbitrary variables use a dedicated prefix for their own
variables.
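
Both namespacing approaches can be sketched as follows. The function
names and the _lc_ prefix are hypothetical, chosen only for this
example, and "local" is used so that it runs in bash and dash alike.

```shell
# Option 1: hand the result back in REPLY.
last_component() {
  REPLY=${1##*/}
}

# Option 2: store the result in the variable named by $1. The helper's
# own variables carry a _lc_ prefix so they cannot shadow the target.
# $1 must be a valid variable name, as it goes through eval.
last_component_into() {
  local _lc_target="$1" _lc_value="${2##*/}"
  eval "$_lc_target=\$_lc_value"
}

caller() {
  local value   # would clash with a helper that used a local "value"
  last_component /usr/local/bin/bash
  echo "via REPLY: $REPLY"
  last_component_into value /etc/passwd
  echo "via name: $value"
}
caller
```

Neither variant needs to unset anything: the first never touches the
caller's variables, and the second relies only on the prefix keeping
the helper's names out of the caller's way.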

In any case, even if that was an essential feature, it would not
be a good reason for breaking the "unset" command or at least
subvert its meaning. Implementing "typeset -n" like in ksh93 or
an "upvar" builtin a la tcl would make a lot more sense IMO.

On comp.unix.shell or http://unix.stackexchange.com, I've posted
many articles describing how to do splitting in POSIX-like
shells:

( # subshell for local scope
  unset -v  IFS # restore default splitting behaviour
  set -o noglob # disable globbing
  cmd -- $var   # split+glob with default IFS and glob disabled
)

I'm now considering adding a note along the lines of:

  "Beware that with current versions of bash, pdksh and yash,
  the above may not work if used in scripts that otherwise use
  typeset/declare/local on $IFS or call a function with
  `IFS=... my-function' (or IFS=... eval... or IFS=...
  source...)"
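
For what it's worth, here is a sketch of how the splitting idiom could
be combined with the loop workaround mentioned earlier so that it
survives callers that declare IFS local. The names split_words, inner
and outer are made up for the example, printf stands in for "cmd", and
"local" is used so that it runs in both bash and dash. Like the loop
workaround itself, this is only reasonable inside a subshell.

```shell
# split_words prints each word of $1 on its own line in angle brackets,
# using default splitting with globbing disabled.
split_words() ( # subshell for local scope
  # Keep unsetting until no outer value of IFS shows through.
  while [ "${IFS+set}" ]; do unset -v IFS; done
  set -o noglob # disable globbing
  set -- $1     # split with default IFS and glob disabled
  if [ "$#" -gt 0 ]; then printf '<%s>\n' "$@"; fi
)

# Two nested callers that each declare IFS local.
inner() { local IFS=.; split_words "$1"; }
outer() { local IFS=:; inner "$1"; }

outer 'a.b:c  d *'
```

This prints <a.b:c>, <d> and <*> on separate lines: the string is split
on blanks only, whatever values of IFS the callers had stacked up, and
the * is not expanded.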

-- 
Stephane
