Re: parameter expansion null check fails for arrays when [*] or [@] is used

2022-03-23 Thread Ilkka Virta
On Tue, Mar 22, 2022 at 11:52 PM L A Walsh  wrote:

> On 2022/03/21 05:45, Andreas Luik wrote:
> > Description:
> >   Bash fails to correctly test for parameter to be unset or null
> when the parameter is an array reference [*] or [@].
> >
> > Repeat-By:
> >
> > myvar[0]=
> > echo "${myvar[0]:+nonnull}"
> >  -> OK
> > echo "${myvar[*]:+nonnull}"
> nonnull -> not OK, because "${myvar[*]}" is null
> >
> myvar[*] = ('', )  element 0 contains an empty string, so not null.
>

The POSIX phraseology is that "null" means the empty string.
${var:+text} tests if var is unset or null, ${var+text} only tests if it's
unset.
https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/V3_chap02.html#tag_18_06_02

"${myvar[*]}" produces the empty string, same as "${myvar[0]}", so it is
a bit odd if using the :+ modifier on them gives different behaviour.

Compare e.g.

a="${myvar[*]}"
echo "${a:+nonnull}"
# vs.
echo "${myvar[*]:+nonnull}"

Not sure what "${array[@]:+text}" should even do though. With @, it should
result in one word per element, but how that should combine with :+ I have
no idea.

It may not matter much in practice, though: Bash 5.0 appears to give empty
output for all of those.


Re: Interesting bug

2022-02-12 Thread Ilkka Virta
On Sat, Feb 12, 2022 at 8:25 PM David Hobach  wrote:

> I guess it closes the function and {echo is interpreted as string or so,
> but that is probably not all (testCode is e.g. never executed).
>

Let's see if I get this right.

Removing the irrelevant parts, your code is pretty much this:

func() {
{echo hello;}
{ echo funny stuff
  exit 1
}
}

The key here is that '{' is a keyword, like 'if' or 'do', not an operator
like '('. That may be quite different from other programming languages, but
here we are. So having '{echo' instead of '{' is a bit like having 'fiecho'
instead of 'fi': it means nothing to the syntax, so the first '}' right after
it ends the function, and the block with "echo funny stuff" ends up at the
top level.

You could have this instead, to the same effect:

func() {
{echo hello
}
if true; then
echo funny stuff
exit 1
fi
fi

Without the explicit "exit", you'd get the syntax error after the "echo
funny stuff" was run. The function itself
is never called, so "echo hello", or whatever there is, never runs.

On the other hand, the shell parses a line in full before executing anything
(it needs to check whether a redirection follows the block), so with '} }' or
'fi fi' on one line instead, the syntax error is caught before the block runs.

For fun, try that with 'fi fi', 'fi; fi', 'fi; xyz' and 'fi xyz', and see
which versions run the block (and try to figure out why).
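To make the keyword-vs-command distinction concrete, here's a minimal
demonstration (a sketch of my own, not taken from the original report):

```shell
# '{' is only recognized as a keyword when it is a separate word:
{ echo hello; }     # a group command; prints: hello

# '{echo' is one word, so it is looked up as an ordinary command name:
{echo hello         # bash: {echo: command not found
```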


Re: I've found a vulnerability in bash

2021-11-19 Thread Ilkka Virta
On Fri, Nov 19, 2021 at 12:53 PM Marshall Whittaker <
marshallwhitta...@gmail.com> wrote:

> You could argue that bash should parse filenames globbed from * that start
> with - and exclude them specifically,
>

Or a shell could prepend ./ to all relative globs. Not sure if that would
change the behaviour of some programs, though.
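You can already get that effect by hand: write relative globs with an
explicit ./ prefix, so no expanded filename can start with a dash (a sketch,
not the OP's scenario):

```shell
# An expanded name like './--version' cannot be parsed as an option by rm.
dir=$(mktemp -d) && cd "$dir"
touch -- '--version' normal.txt
rm ./*              # removes both files; rm sees ./--version, not --version
ls -A               # prints nothing: the directory is empty
```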

But you're free to write a shell or a patch to do something like that, and
see if it gets any traction? I know at least
zsh has some features to warn about doing things like rm *, but at least
the version I tried doesn't seem to check
for filenames that look like options.

Though of course there's also the issue that some utilities take options
that start with a plus, like Bash's +O.


> A short whitepaper on it has been made public at:
> https://oxagast.org/posts/bash-wildcard-expansion-arbitrary-command-line-arguments-0day/
> complete with a mini PoC.
>

Given I just linked you two posts about that from 11 years ago, I fail to
see how you could honestly consider that
a "0-day" issue. Not that people falling into a decades-old trap is much
better, actually, so it probably wouldn't be
a bad thing if shells started warning about that.


Re: I've found a vulnerability in bash

2021-11-17 Thread Ilkka Virta
On Wed, Nov 17, 2021 at 2:42 PM Marshall Whittaker <
marshallwhitta...@gmail.com> wrote:

> [marshall@jerkon]{04:09 AM}: [~/bashful] $ touch -- '--version'
> [marshall@jerkon]{04:09 AM}: [~/bashful] $ rm *
> rm (GNU coreutils) 8.30
> Copyright (C) 2018 Free Software Foundation, Inc.
> License GPLv3+: GNU GPL version 3 or later <
> https://gnu.org/licenses/gpl.html>;.
> This is free software: you are free to change and redistribute it.
> There is NO WARRANTY, to the extent permitted by law.
>
> Written by Paul Rubin, David MacKenzie, Richard M. Stallman,
> and Jim Meyering.
> [marshall@jerkon]{04:09 AM}: [~/bashful] $
>

A common pitfall, due to how the utility can't tell what strings come from
globs and what
were given literally. See e.g.
https://unix.stackexchange.com/questions/1519/how-do-i-delete-a-file-whose-name-begins-with-hyphen-a-k-a-dash-or-minus
and https://dwheeler.com/essays/filenames-in-shell.html (though the latter
is rather long and depressing.)

I don't see this in BashFAQ, though. Is it because it's not strictly about
Bash? Greg?

Also, GNU rm has a helpful helptext about it:

$ rm --help
Usage: rm [OPTION]... [FILE]...
Remove (unlink) the FILE(s).

[...]

To remove a file whose name starts with a '-', for example '-foo',
use one of these commands:
  rm -- -foo

  rm ./-foo

Note that if you use rm to remove a file, it might be possible to recover
some of its contents, given sufficient expertise and/or time.  For greater
assurance that the contents are truly unrecoverable, consider using shred.


Re: bash conditional expressions

2021-11-17 Thread Ilkka Virta
On Wed, Nov 17, 2021 at 1:33 PM Andreas Schwab 
wrote:

> On Nov 17 2021, Michael J. Baars wrote:
>
> > When -N stands for NEW, and touch (-am) gives you a new file
>
> It doesn't.  The file hasn't been modified after it was last read.
>

touch creates the given file if it doesn't previously exist. Immediately
afterwards,
it could be called "new" in the usual English meaning, and would be new in
the
sense that nothing was done to it after it was created. But:

$ echo $BASH_VERSION
5.1.8(3)-maint
$ rm foo.txt
$ ls -l foo.txt
ls: cannot access 'foo.txt': No such file or directory
$ touch -am foo.txt
$ if test -N foo.txt; then echo is new; else echo is NOT new; fi
is NOT new

Of course "new" is not an exact concept; it could be defined e.g. by
comparing the file timestamps with the current time.

Anyway, the documentation doesn't seem to say 'test -N' tests if the file
is "new".


Re: bash conditional expressions

2021-11-15 Thread Ilkka Virta
On Mon, Nov 15, 2021 at 5:21 AM Greg Wooledge  wrote:

>relatime
>   Update inode access times relative to  modify  or  change
> time.
>   Access time is only updated if the previous access time was
> ear‐
>   lier than the  current  modify  or  change  time.
>

I guess that's the important part anyway. But it doesn't look exactly
right: at least on my
system it updates atime also if atime == mtime. So you always get to see if
the file was
read after the last time it was written to.

A newly-created file would have mtime == atime, and if test -N requires
mtime > atime,
it won't find those. Not sure if it should.
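For what it's worth, the mtime-vs-atime behaviour is easy to check (a sketch;
the read step depends on the mount's atime policy, so only the write step is
reliable everywhere):

```shell
f=$(mktemp)                                   # fresh file: atime == mtime
test -N "$f" && echo new || echo 'not new'    # prints: not new
sleep 1
echo data >> "$f"                  # appending bumps mtime but not atime
test -N "$f" && echo 'modified since last read'
rm -f "$f"
```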


Re: bash conditional expressions

2021-11-12 Thread Ilkka Virta
On Fri, Nov 12, 2021 at 3:22 PM Mischa Baars 
wrote:

> It is how my makefiles work :)
>

Sounds to me like you're not using Make but some self-made tool, so the
files you have would more properly be called build scripts or whatever, and
not makefiles.


Re: Arbitrary command execution from test on a quoted string

2021-10-29 Thread Ilkka Virta
On Fri, Oct 29, 2021 at 1:01 AM elettrino via Bug reports for the GNU
Bourne Again SHell  wrote:

> user@machine:~$ USER_INPUT='x[$(id>&2)]'
> user@machine:~$ test -v "$USER_INPUT"
> uid=1519(user) gid=1519(user) groups=1519(user),100(users)
>

What you're doing here is letting the user name a variable, and then testing
whether that variable is set. I'm not sure that makes much sense. The user
probably doesn't, and shouldn't need to, know the names of the variables used
by the script.

It might make more sense to use USER_INPUT as an index to an associative
array that was filled with
some relevant entries and the user was to pick one. But you still get to
watch the quoting:

$ declare -A values=([foo]=123 [bar]=345)
$ USER_INPUT='x[$(id>&2)]'; test -v 'values[$USER_INPUT]' && echo yes || echo no
no
$ USER_INPUT='foo'; test -v 'values[$USER_INPUT]' && echo yes || echo no
yes

(or do the same with [ "${values[$USER_INPUT]+set}" = set ] )

but

$ USER_INPUT='x[$(id>&2)]'; test -v "values[$USER_INPUT]" && echo yes || echo no
uid=1000(itvirta) gid=1000(itvirta) ...
no

Not that I'm sure the upper one is still safe against every input. I think
issues with associative array keys have been
discussed on the list before.

> I don't know whether this happens with anything other than the -v option
> with test; I have not seen it happen under any other circumstance.
>

Arithmetic expansion is the classic one. Here, we expect the user to give
some number and then do arithmetic on it:

USER_INPUT='x[$(id>&2)]'
a=$(( USER_INPUT + 1 )) # or even:
if (( USER_INPUT <= 0 )); then echo invalid input; fi

You have to sanitize the inputs with something like  case $USER_INPUT in
*[!0-9]*) echo error >&2; exit 1 ;; esac  for the numbers.
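A sketch of that kind of validation (the function and variable names are
mine, not from the original report):

```shell
check() {
  local input=$1
  case $input in
    ''|*[!0-9]*) echo 'invalid input' >&2; return 1 ;;
  esac
  echo $(( input + 1 ))    # safe now: input is known to be all digits
}

check 'x[$(date >&2)]'     # prints the error; runs no command substitution
check 41                   # prints: 42
```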


Re: Misleading error when attempting to run foreign executable

2021-10-04 Thread Ilkka Virta
On Mon, Oct 4, 2021 at 4:46 PM Chet Ramey  wrote:

> Bash reports the error it gets back from execve. In this case, it's
> probably that the loader specified for the executable isn't present on your
> system.
>

OTOH, for a script, Bash checks to see if the problem is with the
interpreter and reports accordingly:

 $ ./foo.sh
bash: ./foo.sh: /bin/noexist: bad interpreter: No such file or directory

The shell does go on to stat() the file after getting ENOENT from execve(),
so I suppose it could
add some clarifying note to the error message for the case of a binary file
too.
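The script case is easy to reproduce (a sketch; the interpreter path
/bin/noexist is assumed not to exist on the system):

```shell
f=$(mktemp)
printf '#!/bin/noexist\necho hi\n' > "$f"
chmod +x "$f"
"$f"        # bash: ...: /bin/noexist: bad interpreter: No such file or directory
rm -f "$f"
```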


Re: Exclamation mark when using character classes

2021-08-21 Thread Ilkka Virta
What do you get with [![:digit:]] then? It seems to work the same with both
! and ^ here:

$ now=$EPOCHREALTIME
$ echo "${now%[^[:digit:]]*}" "${now#*[^[:digit:]]}"
1629544775 183030
$ echo "${now%[![:digit:]]*}" "${now#*[![:digit:]]}"
1629544775 183030




On Fri, Aug 20, 2021 at 10:30 PM hancooper via Bug reports for the GNU
Bourne Again SHell  wrote:

> I am using EPOCHREALTIME and then computing the corresponding human
> readable form, that can handle
> changes in locale
>
> now=$EPOCHREALTIME
> printf -v second '%(%S)T.%s' "${now%[^[:digit:]]*}" "${now#*[^[:digit:]]}"
> printf -v minute '%(%M)T' "${now%[^[:digit:]]*}"
> printf -v hour '%(%H)T' "${now%[^[:digit:]]*}"
>
> Incidentally, [![:digit:]] does not work there, you need to use the
> POSIX-specified caret (^) instead of an
> exclamation mark when using character classes. I'm not sure if this is
> intentional or a bug in bash; man
> page doesn't seem to mention it.


Re: EPOCHREALTIME

2021-08-19 Thread Ilkka Virta
On Thu, Aug 19, 2021 at 4:12 PM hancooper  wrote:

> On Thursday, August 19, 2021 12:58 PM, Léa Gris 
> wrote:
> > (LC_NUMERIC=C; echo "$EPOCHREALTIME")
>
> the unix time stamp is merely the number of
> seconds between a particular date and the epoch.  Technically, it should
> be pointed out
> that the time does not change no matter where you are located on the globe.
>
> Thusly, EPOCHREALTIME should not be made to depend on the locale.
>

The locale setting has more to do with which language you speak than with
where you are (that would be the timezone). Even if the number is the same,
it's represented differently in different languages: in Finnish, for example,
the comma would be the proper separator, regardless of whether we speak
Finnish in Finland or in the US.

Regardless, one could argue that using the locale separator here is
counterproductive since
other utils, ones that can actually do calculations on the decimal number
(like bc), might not
support it. But for now, you can always use `${EPOCHREALTIME/,/.}` if you
need it with the
dot, no need to change locales.
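For instance (the sample value below is made up, standing in for what
$EPOCHREALTIME could look like under a comma-decimal locale such as a
Finnish one):

```shell
now='1629544775,183030'    # e.g. $EPOCHREALTIME under a comma-decimal locale
echo "${now/,/.}"          # prints: 1629544775.183030 -- safe to feed to bc
```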


@K transformation

2021-08-19 Thread Ilkka Virta
On Thu, Aug 19, 2021 at 5:49 AM Koichi Murase 
wrote:

> FYI, zsh provides this feature for associative arrays with the syntax
> ${(kv)assoc} or ${(@kv)assoc}. Note that Bash-5.1 already has a
> similar feature ${array[@]@K}, but the substitution result is quoted
> so cannot be directly used for the present purpose.
>

$ declare -A A=([foo bar]="123 456" [adsf]="456 789")
$ printf "<%s>\n" "${A[@]@K}"


Interesting. I wonder, what's the intended use-case for this?

The impression I have is that it's easier to turn a list of multiple words
into one
string (e.g. for display purposes), but much harder to keep things as
distinct
words (for pretty much anything else). So this seems to have a somewhat
limited
usefulness. I can see "${A[*]@K}" producing a single string like that, but
the same
with [@] seems odd compared to how [@] expansions otherwise work.


Re: bash-5.1.8 does not compile anymore on HP-UX due to invalid shell syntax

2021-08-17 Thread Ilkka Virta
On Tue, Aug 17, 2021 at 5:40 PM Greg Wooledge  wrote:

> I'm still wondering what issue the OP is actually seeing.  If they claim
> that changing ${GCC+-stuff} to ${GCC:+-stuff} in some file fixes things,
> and if they also claim that *something* is setting the GCC variable to
> the empty string rather than unsetting it, then perhaps it's not a
> syntactic error at all, but rather a logical error.
>

I have this in the configure script:

if test $ac_compiler_gnu = yes; then
  GCC=yes
else
  GCC=
fi

I don't know which part of autoconf exactly generates it. They also
mentioned it in this message:
https://lists.gnu.org/archive/html/bug-readline/2021-08/msg2.html
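The difference between the two modifiers is easy to demonstrate (the value
'yes' here is just for illustration):

```shell
GCC=                   # set but null, as in the configure snippet above
echo "${GCC+yes}"      # prints: yes    ('+' fires whenever the var is set)
echo "${GCC:+yes}"     # prints nothing (':+' also requires a non-null value)
unset GCC
echo "${GCC+yes}"      # prints nothing (unset: neither modifier fires)
```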


> Either the thing that's setting GCC to the empty string ought to be
> changed to unset GCC, or else the ${GCC+-stuff} check ought to be changed
> to ${GCC:+-stuff} (at the risk of breaking Bourne shell compatibility,
> if we care about that).
>

As far as I can see, that's actually already fixed in the devel branch:
https://git.savannah.gnu.org/cgit/bash.git/tree/configure.ac?h=devel#n417
and
https://git.savannah.gnu.org/cgit/bash.git/tree/CWRU/CWRU.chlog?h=devel#n1503


Re: RFE: new option affecting * expansion

2021-08-17 Thread Ilkka Virta
On Tue, Aug 17, 2021 at 5:36 AM Dale R. Worley  wrote:

>cat $( glob * )
>
> where glob would get one argument, "*", and output a list of file
> names.  A glob-by-modification-date program would be a better solution
> for this need, IMHO.
>

So that program would have to duplicate the globbing code, would need to be
kept in sync if the shell gained new globbing features, and would still end
up
using different globbing rules than the shell, since it couldn't know if
e.g. dotglob
or nocaseglob were set.

Even if you made it so that it didn't take a glob, but just a list of
filenames to
sort, you'd still need to run it as something like this to support arbitrary
filenames:

  mapfile -t -d '' files < <(sort-by-mtime -0 *)

As opposed to using a simple $(...) as you did above, or *(om) as in zsh,
that is. Well, the upside is it wouldn't require changes to the shell, but
doesn't
really look too handy to use.


Re: An alias named `done` breaks for loops

2021-08-16 Thread Ilkka Virta
On Sun, Aug 15, 2021 at 2:00 AM George Nachman  wrote:

> Defining an alias named `done` breaks parsing a for loop that does not have
> an `in word` clause.
>

> alias done=""
>

Works for me:

$ set -- a b c
$ alias done='echo hi; done'
$ for x do done
hi
hi
hi

Not that I think it's a good idea to use aliases to mess with the syntax,
or that I
could see what use that empty alias could possibly have.

The way aliases work is kinda unclean to begin with, so if you want to avoid
shooting yourself in the foot with them, just don't use them. With functions,
overriding 'done' already seems impossible:

$ done() { echo 'hello?'; }
bash: syntax error near unexpected token `('
$ \done() { echo 'hello?'; }
bash: `\done': not a valid identifier


Re: Word splitting for $@ in variable assignment

2021-06-25 Thread Ilkka Virta
On Thu, Jun 24, 2021 at 5:20 PM Chet Ramey  wrote:

> On 6/24/21 4:09 AM, Ilkka Virta wrote:
>
> > But 3.4 Shell Parameters is a bit confusing: "Word splitting is not
> > performed, with the exception of "$@" as explained below."
>
> This means that "$@" expands to multiple words, even though double quotes
> would usually inhibit that.
>

Like Nora already mentioned, that quote is from the paragraph about scalar
variable assignment, where it doesn't appear to produce more than one word,
but instead behaves exactly as described in Special Parameters for the case
where word-splitting does _not_ happen.

As far as I can tell, the behaviour is the same as when $@ is used in the
tested word or one of the patterns in a 'case'
statement or inside [[. The description for 'case' omits any mention of
word splitting, and the description for [[ explicitly
mentions it's not done, but neither/none of those mention any exceptions.

As an aside, the description of $* could perhaps be changed to also mention
those non-word-splitting contexts, and not only quoted and unquoted contexts,
since in all the non-splitting cases mentioned above, $* seems to expand to
just a single word even if it's not quoted. (Using the first character of IFS
as the joiner, of course.)


Re: Word splitting for $@ in variable assignment

2021-06-24 Thread Ilkka Virta
On Thu, Jun 24, 2021 at 10:53 AM Alvin Seville 
wrote:

>  Hello! I want to understand why the following code doesn't produce any
> error:
> set -- "a b" "c"
> a="$@"
> ? I expected smth like: main.sh: line 2: b: command not found due to word
> splitting according to documentation
> <
> https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Shell-Parameters
> >.
> What am I missing?
>

It's a bit oddly put in the manual. It doesn't actually get word-split as
such, it just gives the individual positional parameters joined with spaces
to a single string.

3.4.2 Special Parameters explains that: "$@: In contexts where word
splitting is not performed, this expands to a single word with each
positional parameter separated by a space."

But 3.4 Shell Parameters is a bit confusing: "Word splitting is not
performed, with the exception of "$@" as explained below." I'm not sure if
this is meant to refer to that sentence 3.4.2, but it's a bit confusing
since that one refers to cases without word splitting. It would probably be
clearer to not call it an exception and just refer to Special Parameters
for how $@ behaves when not word-split.

Anyway, it may be better to use a="$*" as it's treated more consistently
between different shells. a="$@" uses the first character of IFS as the
separator in some shells (Zsh, Busybox, Debian/Ubuntu's Dash), and a space
always in others (Bash, Ksh), while a="$*" consistently uses the first
character of IFS. With the default IFS, the result is of course the same.
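The IFS dependence can be seen directly (Bash shown; as noted, other shells
join "$@" differently):

```shell
set -- "a b" "c"
IFS=:
star="$*"          # always joins with the first character of IFS
at="$@"            # Bash joins with a space here, regardless of IFS
printf '%s\n' "$star" "$at"
# prints:
# a b:c
# a b c
```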


Re: Prefer non-gender specific pronouns

2021-06-06 Thread Ilkka Virta
On Sun, Jun 6, 2021 at 2:49 PM Léa Gris  wrote:

> Either you're acting in bad faith, or you're so confused by your
> gender-neutral delusion that you don't remember that in normal people's
> grammar, "they" is a plural pronoun.
>

Argh, no, that's just an example of the fact that I can't read. Sorry.

I do agree, it would be useful if English did have separate singular
pronouns, both for
"you" and "they". But since it doesn't, we have to work with what we have.
For the second
person, there's of course "thou", but for some reason, I've never heard
anyone suggest using
that in practice.

I do wonder, though, what the gender-neutral delusion here would be? That
there exist women
who use computers and Unix-like systems, and not just men? Even I know, in
real life, some
female Linux users, and while I haven't asked about shells, I expect they
might use Bash at
least to some extent. So I don't think it's unrealistic to accept the fact
that not all users of Bash
might be "he". That may of course have been different 30 years ago, but
then, the times do
change.


Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-06 Thread Ilkka Virta
On Sun, Jun 6, 2021 at 1:31 PM Ilkka Virta  wrote:

> Personally, I'd just want an option to always make . and .. hidden from
> globs. Or rather,
> to never generate . or .. as a pathname component via globbing. But
> without affecting
> other behaviour, like dotglob, and without precluding the use of . or ..
> as static parts of the
> path.
>

Hmm, looking at the code, this already seems to exist, in lib/glob/glob.c:

   /* Global variable controlling whether globbing ever returns . or ..
 regardless of the pattern. If set to 1, no glob pattern will ever
 match `.' or `..'. Disabled by default. */
  int glob_always_skip_dot_and_dotdot = 1;

I didn't read all the code, but as far as I tested from the git version,
that seems to do what I just
wanted and seems sensible to me with Nora's examples too. (I changed the
filenames from the
previous since I started copying their tests now.)

$ touch .foo .doo bar quux

With dotglob (the first is the same as just *):

$ shopt -s dotglob
$ echo @(.foo|*)
bar .doo .foo quux
$ echo !(.foo)
bar .doo quux
$ echo @(bar|.*)
bar .doo .foo

Without it:

$ shopt -u dotglob
$ echo @(.foo|*)
bar .foo quux
$ echo @(bar|.*)
bar .doo .foo

No match for . and .. even explicitly (with failglob here):

$ echo @(.|..)
bash: no match: @(.|..)

All files with dotglob unset:

$ echo @(.|)*
bar .doo .foo quux

Maybe I missed some more obscure case, though.


Re: !(.pattern) can match . and .. if dotglob is enabled

2021-06-06 Thread Ilkka Virta
> Can you write a set of rules that encapsulates what you would like to see?
> Or can the group?
>

I think it's a bit weird that !(.foo) can match . and .. when * doesn't.

One means roughly "anything here" and the other "anything but .foo here", so
having the latter match things the former doesn't is surprising.

Personally, I'd just want an option to always make . and .. hidden from
globs. Or rather,
to never generate . or .. as a pathname component via globbing. But without
affecting
other behaviour, like dotglob, and without precluding the use of . or .. as
static parts of the
path.

As in:
$ touch .dot normal
$ echo .*
.dot
$ echo ./.*
./.dot

And depending on dotglob,  echo *  should give either  .dot normal  or
just  normal .

So, somewhat similarly to how globbing hides pathname components starting
with a
dot when dotglob is unset, just with another option to hide . and .. in
particular.

Frankly, I don't care if that would also mean that ./@(.|..)/ would match
nothing. I don't
see much use for globbing . and .. in any situation, the dangers of
accidentally climbing up
one level in the tree by a stray .* are much worse. Someone else might
disagree, of course,
but if one really wants to include those two, brace expansion should work
since the two
names are always known to exist anyway. And of course if it's an option,
one doesn't need
to use it if they don't like it.

For what it's worth, Zsh, mksh and fish seem to always hide . and .. , and
at least Zsh does
that even with (.|..) or @(.|..) .


I tried to achieve that via GLOBIGNORE=.:.. , but that has the problem that
it forces dotglob
on, and looks at the whole resulting path, so ./.* still gives ./. and ./..
. Unless you use
GLOBIGNORE=.:..:*/.:*/.. etc., but repeating the same for all different
path lengths gets a bit
awkward.


Re: Prefer non-gender specific pronouns

2021-06-06 Thread Ilkka Virta
On Sun, Jun 6, 2021 at 5:50 AM Léa Gris  wrote:

> On 05/06/2021 at 18:47, John Passaro wrote:
> > I can see a couple reasons why it would be a good thing, and in the con
> > column only "I personally don't have time to go through the manual and
> make
> > these changes". but I'd happily upvote a patch from somebody that does.
>
> I can see so many reasons why it would be a bad thing to let the cancel
> culture adepts slip in here, rewriting bash documentations with their
> custom grammar.
>

Using 'they' for a generic, indefinite person, like the user here, one who
could be
anyone, is totally normal use of the language, and not even a very new
invention.

It's not the same as calling a definite person of known gender 'they'.

In fact, that generic 'they' is so common and accepted, that you just used
it yourself
in the part I quoted above.


Brace expansion ordering vs. parameter expansion

2021-04-29 Thread Ilkka Virta
On Thu, Apr 29, 2021 at 4:18 AM Chet Ramey  wrote:

> Maybe, but it's never worked that way and was never intended to. You can
> get what you need using eval:
>
> eval echo \{1..${i}}
>

BTW, was there some background to why they're ordered like this? I'm not
sure if I have heard the story, and didn't see anything about it in Greg's
wiki or bash-hackers.org (of course they tell the "what", but not the
"why"). I didn't dig through all the mailing lists, though.

The versions of ksh I have seem to do braces after parameter expansions
(even interpreting unquoted braces that come from expansions), so

$ ksh -c 'a=3 b=5; echo {$a..$b}'
3 4 5
$ ksh -c 'brace="{3..5}"; echo $brace'
3 4 5

The braces-then-variables order also comes up somewhat often on
unix.stackexchange, with people trying the {$a..$b} and being baffled it
doesn't work. The Bash behaviour allows doing $v{a,b} to expand $va and $vb
instead, but that doesn't seem too useful, can't be used to expand the
variables quoted, and would probably be more of a use case for associative
arrays anyway.

In the simple case above, using 'eval' of course works, but it starts
getting problematic if there's a more complex command line, with variables
that should _not_ run through all expansions again. E.g.

somecommand "$foo" {1..$i}

would require something like

eval somecommand '"$foo"' \{1..$i}

with extra quotes or backslashes added to any expansions and quotes on the
command line.

For a loop over some range of numerical values there is always for (( ...
)), but for x in {$i..$j} would be shorter and would work with something
like for x in foo{$i..$j} without an extra step of gluing the strings
together inside the loop.

Of course there's other ways with subshells, temporary arrays and using
e.g. seq (but I'm not sure that exists on all systems either).
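For completeness, the temporary-array route without eval might look like
this (the names are mine, just for illustration):

```shell
i=3
words=()
for (( x = 1; x <= i; x++ )); do
  words+=( "foo$x" )      # glue the prefix on while building the list
done
printf '%s\n' "${words[@]}"
# prints foo1, foo2 and foo3 on separate lines
```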


Re: Document variable names need to be all ASCII

2021-04-19 Thread Ilkka Virta
What a 'name' is, is further defined under "Definitions": "name: A word
consisting solely of letters, numbers, and underscores, ..."

But it seems you're right that it doesn't say the locale's idea of letters
isn't taken into account. Some other shells do accept those.

On Mon, Apr 19, 2021 at 4:28 PM 積丹尼 Dan Jacobson 
wrote:

> $ e哈=1
> bash: e哈=1: command not found
> OK, but on man bash is it ever mentioned that a variable name must be all
> ASCII?
>
> ENVIRONMENT
>When a program is invoked it is given an array of  strings  called
> the
>environment.   This  is  a  list  of  name-value  pairs,  of  the
> form
>name=value
>
> PARAMETERS
>A parameter is an entity that stores values.  It can be a name, a
> num‐
>ber, or one of the special characters listed below under Special
> Param‐
>eters.  A variable is a parameter denoted by a name.  A variable
> has ...
>
>


Re: Changing the way bash expands associative array subscripts

2021-04-06 Thread Ilkka Virta
On Tue, Apr 6, 2021 at 6:53 PM Greg Wooledge  wrote:

> In that case, I have no qualms about proposing that unset 'a[@]' and
> unset 'a[*]' be changed to remove only the array element whose key is
> '@' or '*', respectively, and screw backward compatibility.
>

That also seems to be what Ksh and Zsh do.

$ zsh -c 'k=@; typeset -A a=("$k" at foo 123); typeset -p a; unset "a[$k]"; typeset -p a'
typeset -A a=( @ at foo 123 )
typeset -A a=( foo 123 )
$ ksh -c 'k=@; typeset -A a=([$k]=at [foo]=123); typeset -p a; unset a[$k]; typeset -p a'
typeset -A a=([@]=at [foo]=123)
typeset -A a=([foo]=123)

Both also have issues with unset a[$k] when k="x] b[y", btw.

What konsolebox said about a[$k]=() works in my Zsh for indexed arrays, but
not associative ones.
(It replaces an array slice, so can also be used to insert elements in the
middle.)


Re: Changing the way bash expands associative array subscripts

2021-04-06 Thread Ilkka Virta
On Tue, Apr 6, 2021 at 6:13 PM Greg Wooledge  wrote:

> As a counter-proposal, Chet could entirely remove the special meaning
> of unset 'a[@]' and introduce a new option to unset which would take
> its place.  It appears -a is not yet used, so that would be a good pick.
>

Unless I missed something, doesn't just  unset a  do the same:

$ declare -A a=([foo]=123 [bar]=456)
$ unset a
$ declare -p a
bash: declare: a: not found

$ declare -A a=([foo]=123 [bar]=456)
$ unset 'a[@]'
$ declare -p a
bash: declare: a: not found

 i.e. both remove the whole array, not just the contents.


Re: select syntax violates the POLA

2021-04-02 Thread Ilkka Virta
On Thu, Apr 1, 2021 at 7:59 PM Robert Elz  wrote:

> Alternatively
> d=( $( ls -d /usr/src/pkg/*/$1 ) )
> or just
> d=( $( printf %s\\n /usr/src/pkg/*/$1 ) )
>
> Just to be sure.Personally I'd do
>
> set -- /usr/src/pkg/*/$1
>

Just the glob is fine in the array assignment, it splits and globs the same
as in arguments to  'set':

d=( /usr/src/pkg/*/$1 )

(If there was any context that splits but doesn't glob, this isn't one)


Re: select syntax violates the POLA

2021-04-02 Thread Ilkka Virta
On Fri, Apr 2, 2021 at 2:04 AM Robert Elz  wrote:

> chet.ra...@case.edu said:
>   | Yes, you need a list terminator so that `done' is recognized as a
> reserved
>   | word here. `;' is sufficient. Select doesn't allow the `done' unless
> it's
>   | in a command position.
>
> isn't really all that appealing as an explanation.   select isn't part
> of the standard, so its syntax is arbitrary, which means that nothing can
> really be considered wrong, but while we often think of reserved words
> (not counting the special cases in case and for statements) as only working
> in the command word position, that's not how it really is.  They work
> there,
> they also should work following other reserved words (most of them, but
> '}' is not one of the exceptions).   so '} done' should work correctly,
> always, if the '}' is a reserved word, and a ';' or newline between them
> should not be needed.
>

FWIW, it works in the other shells I know that support select:

 $ cat select.sh
select x in foo bar; do {
echo $x;
break;
} done;

$ for sh in bash ksh mksh zsh; do echo "== $sh"; $sh select.sh <<< 1; done
== bash
select.sh: line 5: syntax error near unexpected token `done'
select.sh: line 5: `} done;'
== ksh
1) foo
2) bar
foo
== mksh
1) foo
2) bar
#? foo
== zsh
1) foo  2) bar
?# foo


Re: zsh style associative array assignment bug

2021-03-30 Thread Ilkka Virta
On Tue, Mar 30, 2021 at 1:40 AM Eric Cook  wrote:

> Its just when populating that array dynamically with another array
> if that second array didn't contain `v1' hypothetically, the array gets
> shifted to
>
> a=( [k1]=k2 [v2]=k3 [v3]= )
> which i would imagine to be unexpected for the author of the code and
> would rather
> it error out instead of chugging along.
>

Just checking the parity can never help if there's a risk of values missing
from the middle of the list. What if two values are missing? You could be
left with (k1 v1 k2 k3) or (k1 v1 k2 v3).
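That said, the parity check you're asking for is easy enough to do when
pairing the list up by hand, even though it can't detect a value missing
from the middle (a sketch; the variable names are mine):

```shell
list=(k1 v1 k2)                          # odd length: one value missing
if (( ${#list[@]} % 2 )); then
  echo 'odd number of elements' >&2      # error out instead of shifting keys
else
  declare -A a=()
  for (( j = 0; j < ${#list[@]}; j += 2 )); do
    a[${list[j]}]=${list[j+1]}
  done
  declare -p a
fi
```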


Re: zsh style associative array assignment bug

2021-03-29 Thread Ilkka Virta
On Mon, Mar 29, 2021 at 1:56 AM Greg Wooledge  wrote:

> Python is different:
>
> >>> y = ["a", "b", "c", "d"]
> >>> dict(zip(y[::2], y[1::2]))
> {'a': 'b', 'c': 'd'}
> >>> x = ["a", "b", "c"]
> >>> dict(zip(x[::2], x[1::2]))
> {'a': 'b'}
>
> It seems to discard the last (unmatched) value.  Also, dear gods, what
> is that horrible syntax they forced me to google for... and it took MANY
> tries to find it, too.
>

That's a bit different, since it's not really a built-in feature but doing
it manually,
just using some rather terse syntax and tersely named functions. Apart from
Python
having such stuff available, it's not really unlike just filling an
associative array from
an indexed one with a loop in the shell.  And if you do it manually, you
can do
anything you like. You can't do it directly in Python the same way as in
Zsh, Bash or
Perl, though.
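For comparison, the shell-loop approach mentioned above can be sketched like this (the names `pairs` and `map` are made up for this example):

```shell
# Fill an associative array from an indexed one, pairwise:
# even indices become keys, odd indices become values.
pairs=(a b c d)
declare -A map
for ((i = 0; i + 1 < ${#pairs[@]}; i += 2)); do
    map[${pairs[i]}]=${pairs[i+1]}
done
echo "${map[a]} ${map[c]}"    # b d
```

Since the loop condition requires both an index and its pair, an unpaired trailing element is silently dropped here, just like in the Python zip() version.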

[ more Python follows, ignore if you like ]

Just dropping a list to dict() would probably be the most direct
equivalent, but it
doesn't work:

>>> dict([ "a", "b", "c", "d" ])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #0 has length 1; 2 is
required

You have to pair the elements up (which is what zip() does there):

>>> dict([ ("a", "b"), ("c", "d") ])
{'a': 'b', 'c': 'd'}

And you can't leave unpaired elements (or sets of three):

>>> dict([ ("a", "b"), ("c",) ])
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: dictionary update sequence element #1 has length 1; 2 is
required


Using 'itertools.zip_longest' like Andreas showed would be another
alternative for
doing it manually. It would generate explicit 'None's to fill the shorter
list, which then
would appear in the result. Of course, that's possible because Python has
'None' to
begin with, like Perl has 'undef'. The shell doesn't really have the
equivalent of that.
IMO, an unset value would be the nearest equivalent (vs. an empty string
being an
explicitly defined value), but I guess that's arguable.



I'm not sure what you meant as the horrible syntax, and you probably don't
want to
know this, :) but the index in x[1::2] takes every other element of 'x',
starting at 1.
Somewhat similarly to a brace expansion like {1..9..2}. The zip() there
pairs the
even and odd halves together.


Re: zsh style associative array assignment bug

2021-03-28 Thread Ilkka Virta
On Sun, Mar 28, 2021 at 10:16 AM Oğuz  wrote:

> That an idea was borrowed from another shell doesn't mean it should be
> implemented the same kludgy way. Besides, bash doesn't offer compatibility
> with zsh.
>

You don't think the realm of POSIX-ish shells already has enough
incompatibilities
and minor quirks between the different shells? :)

> The oldest version of Bash I ever used is 4.4.20, and if my memory serves
> me right, it didn't regard omission of a value inside a compound assignment
> (like `foo=([bar]=)`) as an error. If, contrary to that, it actually did,
> that was a mistake.
>

But this is an error:

   a=([foo]=123 [bar])

Now, the syntax is different so it's not a fair comparison, really. But
spinning up an
empty word where none exists is not something the shell usually does
anywhere
else, so why should it do that here?

> I believe bash will eventually (if it hasn't already in devel) allow `$@'
> and the like to expand to multiple words inside a compound assignment to an
> associative array. Being forced to trim an array to an even number of
> elements before using it inside such an assignment would be really
> annoying, in my opinion.
>

 I wonder, what would the use case be? I could understand assigning the
words
from "$@" to the keys of an associative array might be useful, but where
would
you want to fill the keys and values, while at the same time silently
allowing a
missing value? Or silently dropping one. Shouldn't a script treat that as
an error
and have the user recheck what they're doing, the same as in any case where
a tool gets too many or too few arguments?
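A sketch of the erroring-out behaviour argued for here, as a hypothetical helper function (the name `make_assoc` is invented for this example) that rejects odd-length argument lists before assigning anything:

```shell
# Refuse to build key/value pairs from an odd number of arguments.
make_assoc() {
    if (( $# % 2 )); then
        echo "make_assoc: odd number of arguments ($#)" >&2
        return 1
    fi
    # ...the actual pairwise assignment would go here...
    return 0
}

make_assoc k1 v1 k2 || echo "rejected"       # rejected
make_assoc k1 v1 k2 v2 && echo "accepted"    # accepted
```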


Re: Wanted: quoted param expansion that expands to nothing if no params

2021-03-24 Thread Ilkka Virta
On Wed, Mar 24, 2021 at 9:38 PM L A Walsh  wrote:

> Hmmm...Now that I try to show an example, I'm not getting
> the same results.  Grrr.  Darn Heizenbugs.
>

Just remember that if you test with printf, it always prints at least once,
which makes it look exactly as if it got an empty string argument, even if
there are none:

$ set --
$ printf ":%s:\n" "$@"
::
$ set -- ; printf ":%s:\n" x "$@"
:x:


ignoreeof variable (lowercase) as a synonym for IGNOREEOF

2021-03-22 Thread Ilkka Virta
The lowercase 'ignoreeof' variable appears to act as a sort of a synonym to
the uppercase 'IGNOREEOF'. Both seem to call into 'sv_ignoreeof', and the
latter one set takes effect. I can't see the lowercase one documented
anywhere, is this on purpose?


Re: is it a bug that \e's dont get escaped in declare -p output

2021-03-17 Thread Ilkka Virta
On Wed, Mar 17, 2021 at 8:26 PM Greg Wooledge  wrote:

> I thought, for a moment, that bash already used $'...' quoting for
> newlines, but it turns out that's false.  At least for declare -p.
> It would be nice if it did, though.  Newlines, carriage returns, escape
> characters, etc.
>

It does in some cases:

 $ a=($'new \n line' $'and \e esc'); declare -p a
declare -a a=([0]=$'new \n line' [1]=$'and \E esc')


Re: echo $'\0' >a does not write the nul byte

2021-01-17 Thread Ilkka Virta
On Mon, Jan 18, 2021 at 12:02 AM Martin Schulte 
wrote:

> To be exact, this is already caused by a Unix Kernel - strings passed by
> the exec* system calls are null-terminated, too, as is the default in
> the C programming language. Thus you can't pass a null byte in an argument
> when invoking a program.
>

Bash's echo is a builtin, so using it doesn't involve an execve(). Most
shells still don't allow passing NULs
to builtins. Zsh is the exception, it does allow NUL bytes in internal
variables and in arguments to builtins.
But, that's Zsh, and of course even it can't pass the NUL as an argument to
an external program, exactly
because of the execve() interface.
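A quick way to see Bash's behaviour is through command substitution, which drops NUL bytes from captured output (newer versions print a warning about it):

```shell
# The NUL byte produced by printf does not survive capture into a variable.
out=$(printf 'a\0b')
echo "${#out}"    # 2 -- the NUL is gone, leaving just "ab"
```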


Re: V+=1 doesn't work if V is a reference to an integer array element

2021-01-13 Thread Ilkka Virta
>
> Lots of things "could be useful" if only they worked, but they don't work,
> so you don't do them.
>

Yes, and it does work in 4.4.


> The fact that this undocumented, unsupported hack "appeared to work"
> with certain inputs in bash 4.4 is just an unfortunate event, unless
> Chet decides to make it an actual supported feature.
>

https://www.gnu.org/software/bash/manual/html_node/Shell-Parameters.html ,
bottom chapter, in the middle:

   However, nameref variables can reference array variables and subscripted
array variables.


Re: V+=1 doesn't work if V is a reference to an integer array element

2021-01-13 Thread Ilkka Virta
On Wed, Jan 13, 2021 at 7:49 PM Greg Wooledge  wrote:

> On Wed, Jan 13, 2021 at 07:00:42PM +0200, Oğuz wrote:
> > $ declare -n b=a[0]
>
> I can't see any documentation that supports the idea that this should
> be allowed in the first place.
>

It's arguably useful though, and works in 4.4 (with -i or not):

$ ./bash4.4.12 -c 'declare -ai a=1; declare -n b="a[0]"; b+=1; echo $b;
b=123; echo $b; declare -p a b'
2
123
declare -ai a=([0]="123")
declare -n b="a[0]"

Also a simple assignment seems to work, just the += case fails:

$ ./bash5.0 -c 'declare -ai a=(11 22 33); declare -n b="a[1]"; b=123; echo
$b; declare -p a'
123
declare -ai a=([0]="11" [1]="123" [2]="33")

Then again, using a nameref like that in an arithmetic context doesn't seem
to work (in either 4.4 or 5.0) but
silently gives a zero (which is what you get if b pointed to an unset
variable, but here I can't see what the
variable it points to would be):

$ ./bash4.4.12 -c 'declare -a a=(11 22 33); declare -n b="a[1]"; echo
$((b))'
0


Re: Arithmetic pow results incorrect in arithmetic expansion.

2021-01-09 Thread Ilkka Virta
On Sat, Jan 9, 2021 at 9:14 AM Robert Elz  wrote:

> Not "all" other shells, most don't implement exponentiation at all,
> since it isn't a standard C operator.
>

Which also means that the statement about the operators being "same as in
the C language" doesn't really help figure out how this particular one
works.
Which is probably why Zsh has explicitly mentioned it.


Re: Arithmetic pow results incorrect in arithmetic expansion.

2021-01-09 Thread Ilkka Virta
On Sat, Jan 9, 2021 at 7:22 AM Oğuz  wrote:

> On Saturday, 9 January 2021, Hyunho Cho  wrote:
> > $ echo $(( -2 ** 2 )) # only bash results in 4
> > 4
> `bc' does that too. Here's another trivial side effect of implementing
> unary minus using binary minus:
>

Note that binary minus doesn't really compare here. It has a lower
precedence, so gives a different result:

$ echo $(( -3 ** 2 ))
9
$ echo $(( 0 - 3 ** 2 ))
-9


Re: Associative array keys are not reusable in (( command

2021-01-08 Thread Ilkka Virta
On Fri, Jan 8, 2021 at 5:44 PM Chet Ramey  wrote:

> On 1/8/21 10:24 AM, Oğuz wrote:
> > This situation is why bash-5.0 introduced the `assoc_expand_once'
> option.
> > But it allows arbitrary command injection.
>
> If you want to run array keys through word expansions, this is one
> potential result. Command substitution is "arbitrary command injection."
>

Let's say I don't want to run the keys through expansions. How
does assoc_expand_once help then, if it doesn't stop the expansion inside
$key?

With the keys in the previous messages, what _seems_ to work is

  shopt -u assoc_expand_once
  (( assoc[\$k]++ ))

But it only works with assoc_expand_once disabled, and somehow, I'm not
sure if it's safe for more complex keys.

then there's also

x=${assoc[$k]}
assoc[$k]=$(( x + 1 ))

which doesn't seem to be affected by assoc_expand_once but is a bit tedious
to use.


Re: declare -p name=value thinks 'name=value' is variable

2021-01-08 Thread Ilkka Virta
On Fri, Jan 8, 2021 at 4:06 PM Chet Ramey  wrote:

> No. `declare -p' does not accept assignment statements.
>

The synopsis in the documentation doesn't make it clear, though. It shows
only one case with -p and assignment, while the similar case of export -p
is listed separately from the other forms of export. Well, at least in my
5.0 man page, not on the online manual, though. Also, the documentation
doesn't seem to say the assigned value is ignored with -p, even though it
does say additional options are ignored.

So, suggest changing the synopsis from

  declare [-aAfFgiIlnrtux] [-p] [name[=value] …]

to something like:

  declare [-aAfFgiIlnrtux] [name[=value] …]
  declare [-aAfFgiIlnrtux] -p [name…]

And/or add to:

  "The -p option will display the attributes [...] additional options,
other than -f and -F, are ignored."

something like

  "An assignment cannot be used with the -p option."

Also, the error message should probably be "invalid variable name" like you
get for referring to something like that via ${!var}, instead of "not
found".

Since I mentioned export, export -p foo doesn't seem to do anything, even
if foo is exported, not even give an error. But, contrary to declare,
export -p foo=bar
_does_ assign the value, silently. Confusing.


Re: Checking executability for asynchronous commands

2020-12-28 Thread Ilkka Virta
On Mon, Dec 28, 2020 at 3:16 PM Greg Wooledge  wrote:

> The problem is that the parent bash (the script) doesn't know, and
> cannot know, that the command was stillborn.  Only the child bash
> process can know this, and by the time this information has become
> available, the parent bash process has already moved on.
>

In principle, if the parent and child were to cooperate, I think the status
of the final execve()
could be communicated to the parent like this: Set up a pipe between the
parent and the child,
with the write side set to close-on-exec, and have the parent block on the
read side. If the
execve() call fails, the child can send an error message via the pipe, and
if it succeeds, the
parent will see the pipe being closed without a message.

Polling the child after some fraction of a second might not be able to tell
a failed execve()
apart from the exec'ed process exiting after the exec.


Re: while loops can not read "\"

2020-12-23 Thread Ilkka Virta
Regardless, my point is that "as with word splitting" is not exactly true,
and misleading statements
have no place in documentation if it's meant to be of any use to a user who
needs the documentation
to begin with.

In any case, adding "A backslash can be used to escape a delimiter or a
newline." shouldn't make
the text significantly heavier, given it already uses almost double the
letters to spell out how the words
are assigned to the names given in order.
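A quick illustration of the two escaping behaviours that the suggested sentence would document (both apply only without -r):

```shell
# Backslash-newline acts as line continuation...
printf 'one\\\ntwo\n' | { read line; echo "$line"; }       # onetwo
# ...and backslash-<delimiter> makes the delimiter literal,
# so "a\ b" stays one field.
printf 'a\\ b c\n' | { read x y; echo "[$x] [$y]"; }       # [a b] [c]
```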


On Wed, Dec 23, 2020 at 4:44 PM Chet Ramey  wrote:

> On 12/22/20 9:13 AM, Ilkka Virta wrote:
> > Arguably it's a bug that 'help read' doesn't mention the effect of
> > backslashes, other than what can be extrapolated from the description of
> > -r. It only says "The line is split into fields _as with word
> splitting_",
> > but word splitting doesn't recognize backslashes as special. It should
> not
> > be necessary to read the description of all options to infer the
> behaviour
> > of a command as used without them.
>
> That isn't the function of `help'. The help documentation is intended to be
> a quick reference, not to duplicate everything in the man page.
>
> --
> ``The lyf so short, the craft so long to lerne.'' - Chaucer
>  ``Ars longa, vita brevis'' - Hippocrates
> Chet Ramey, UTech, CWRU    c...@case.edu    http://tiswww.cwru.edu/~chet/
>


Re: while loops can not read "\"

2020-12-22 Thread Ilkka Virta
Arguably it's a bug that 'help read' doesn't mention the effect of
backslashes, other than what can be extrapolated from the description of
-r. It only says "The line is split into fields _as with word splitting_",
but word splitting doesn't recognize backslashes as special. It should not
be necessary to read the description of all options to infer the behaviour
of a command as used without them.

The online reference is clearer on this:
https://www.gnu.org/software/bash/manual/html_node/Bash-Builtins.html#index-read

On Tue, Dec 22, 2020 at 4:05 PM  wrote:

> On 22/12/2020 08:18, ffvh gfff wrote:
> > Machine: x86_64
> > OS: linux-gnu
> > Compiler: gcc
> > Compilation CFLAGS: -g -O2 -fstack-protector-strong -Wformat
> > -Werror=format-security -Wall
> > uname output: Linux kali 5.7.0-kali1-amd64 #1 SMP Debian 5.7.6-1kali2
> > (2020-07-01) x86_64 GNU/Linux
> > Machine Type: x86_64-pc-linux-gnu
> >
> > Bash Version: 5.1
> > Patch Level: 0
> > Release Status: release
> >
> > command line:
> > cat poc.txt | while read i; do echo $i;do
> In fact, the read builtin can read backslashes. It's just that you
> didn't escape them. Use the -r option.
>
> $ help read | grep -- -r
>-r   do not allow backslashes to escape any characters
>
> $ echo '\' | { read i; echo "$i"; }
>
> $ echo '\' | { read -r i; echo "$i"; }
> \
>
> --
> Kerin Millar
>
>


Re: increment & decrement error when variable is 0

2020-11-24 Thread Ilkka Virta
On Tue, Nov 24, 2020 at 12:07 AM Jetzer, Bill 
wrote:

> ((--x)) || echo "err code $? on --x going
> from $i to $x";
>
> err code 1 on ++x going from -1 to 0
>

That's not about --x, but about the ((...)) construct:

"(( expression ))
The arithmetic expression is evaluated according to the rules described
below (see Shell Arithmetic). If the value of the expression is non-zero,
the return status is 0; otherwise the return status is 1. This is exactly
equivalent to let "expression". See Bash Builtins, for a full description
of the let builtin."

https://www.gnu.org/software/bash/manual/html_node/Conditional-Constructs.html#Conditional-Constructs
Note the second sentence and try e.g.:

$ if (( 100-100 )); then echo true; else echo false; fi
false

That also matches how truth values work in e.g. C, where you could write if
(foo) { ... } to test if foo is nonzero.
Also, consider the return values of the comparison operators. The behaviour
here makes it possible to implement
them by just returning a number, instead of having to deal with a distinct
boolean type that would affect the exit status:

$ (( a = 0 < 1 )); echo $a
1

> err code 1 on x++ going from 0 to 1

As to why you get this here, when going to one instead of zero, remember
the post-increment returns the original value.


Re: use of set -e inside parenthesis and conditionnal second command

2020-11-17 Thread Ilkka Virta
On Tue, Nov 17, 2020 at 4:07 PM Pierre Colombier via Bug reports for the
GNU Bourne Again SHell  wrote:

> #2
> pierre@zebulon: ~ $ (set -e ; echo A ; false ; echo B ) && echo C.$?
> #3
> pierre@zebulon: ~ $ bash -c 'set -e ; echo A ; false ; echo B ' && echo
> C.$?



 If it's not a bug, I think the manual should explain the difference
> between #2 and #3 in section 3.2.3 and 3.2.4.3
>

Technically, I suppose the description of set -e already says that (4.3.1
The Set Builtin):

"If a compound command or shell function executes in a context where -e is
being ignored,
none of the commands executed within the compound command or function body
will be affected
by the -e setting, even if -e is set and a command returns a failure
status. "

The subshell (set -e; echo...) is a compound command, and it executes as a
non-final part of the && list,
so none of the commands within are affected by set -e. The other command
explicitly invoking bash is not
a subshell or any other type of compound command.

Now, perhaps that could use a note explicitly saying this also means
subshells, even though they may
have set -e in effect independently of the main shell.

The part explaining subshells (3.7.3 Command Execution Environment) could
perhaps also use a mention
of that caveat, since it does mention set -e already, though in the context
of command substitution.

https://www.gnu.org/software/bash/manual/html_node/The-Set-Builtin.html
https://www.gnu.org/software/bash/manual/html_node/Command-Execution-Environment.html
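A minimal side-by-side of the two cases discussed above:

```shell
# A separate bash process enforces its own set -e and exits at `false`,
# so nothing prints (the || true only keeps this demo's exit status clean):
bash -c 'set -e; false; echo ran' && echo ok || true
# A subshell on the left of && executes in a context where -e is ignored,
# so both lines print:
( set -e; false; echo ran ) && echo ok     # ran / ok
```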


Re: execve E2BIG (Argument list too long)

2020-09-30 Thread Ilkka Virta
On Wed, Sep 30, 2020 at 5:53 PM Michael Green  wrote:

> The included short script when run with the following command results
> in execve "E2BIG (Argument list too long) errors".
>
> * The number of arguments can be tuned down to "seq 1 23694" and it
> still occurs, but any lower and it disappears.
>

That sounds a lot like the 128 kB hard limit Linux puts on the size of a
single argument to exec. (I know it applies to command-line arguments; I
expect it also applies to environment variables, since they're passed in
pretty much the same way.)

seq 1 23694 | wc -c gives 131058, just a bit less than 131072. Add the
variable name and it goes over. Workaround: use another OS, or pass big
data like that in files.
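The limit can be demonstrated directly by handing one oversized argument to an external command (on Linux; other systems may accept it):

```shell
# Build a single string slightly over the 131072-byte per-argument cap.
big=$(printf 'x%.0s' {1..140000})
echo "length: ${#big}"
# On Linux the execve() fails with E2BIG; bash reports "Argument list too long".
/bin/echo "$big" >/dev/null 2>&1 || echo "exec failed: argument too long"
```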


Re: Bash parameter expansion (remove largest trailing match, remove largest leading match, pattern replacement) does not work

2020-08-30 Thread Ilkka Virta
On Sat, Aug 29, 2020 at 11:13 PM Bruce Lilly  wrote:

> It's a bit more complicated than that; if, for example, some excerpt ended
> up in regression tests, there would be a question about whether or not
> there was a copyright violation.  As I understand the GPL (IANAL), it
> requires all parts of a "work" to be GPL'd, and that wouldn't be possible
> for any parts of the script that ended up in bash regression tests.
>

That's hilarious. People post proof-of-concept scripts and code snippets as
part of bug
reports and such every day. If you'd just reduced the problem to a simple
demonstration
(below), you could have explicitly licensed it under the GPL if you were
afraid someone might
want to include it to a GPL'd software. In any case, for a one-liner like
this, it might not even
be copyrightable (at least not everywhere) as pretty much lacks any
creativity. I'd also assume
that test scripts often aren't even compiled with the main program, just
aggregated to the
code distribution. Anyway, it wouldn't be you doing the copyright violation
if someone used
your snippet without license.

Bash and ksh indeed differ in this:

 $ bash -c 'str=foo/; sep="\057"; printf %s\\n ${str%%$sep}'
 foo/

 $ ksh -c 'str=foo/; sep="\057"; printf %s\\n ${str%%$sep}'
 foo

And there's nothing in the Bash manuals that says \057 should be taken as
an octal
escape in a pattern match. The workarounds to that are either sep=$'\057'
which is
documented to accept escapes like this, or sep=/ which just works in the
obvious manner.
I do wonder why you'd even bother with trying the octal escape here instead
of just writing
the slash as a slash. Something like \001 would be different, of course.
The fact that $'\057'
does what you seem to want is exactly the part where you might have used a
form of quoting
which would have worked, but there was no way for the reader to check that
because you
hid the code.

$'' is described here:
https://www.gnu.org/software/bash/manual/html_node/ANSI_002dC-Quoting.html
(search for 'octal')
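The contrast between the two quoting forms, in bash:

```shell
str=foo/
sep=$'\057'            # ANSI-C quoting: \057 becomes a literal /
echo "${str%%$sep}"    # foo
sep="\057"             # plain string: backslash, 0, 5, 7 -- no octal escape
echo "${str%%$sep}"    # foo/  (the pattern \057 matches nothing at the end)
```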

Of course, you also appear to have missed extglob, which I guess is
understandable if you're
coming from ksh. But even so, reducing the problem to smaller, easier to
debug pieces would
have shown the difference there, too, separately from the differences in
handling of octal escapes.
And perhaps led you to read the rest of what the manual says on Pattern
Matching, from the exact
page you linked to. ("If the extglob shell option is enabled using the
shopt builtin, several extended
pattern matching operators are recognized...")


Re: Bash parameter expansion (remove largest trailing match, remove largest leading match, pattern replacement) does not work

2020-08-29 Thread Ilkka Virta
On Sat, Aug 29, 2020 at 9:56 PM Bruce Lilly  wrote:

> Please don't assume that something output by printf (without quoting) for
> clarity is representative of actual expansion by the shell when properly
> quoted.
>

If you don't want people to assume (and you shouldn't, if you want them to
help you),
you'd better post the actual script you use, and not just something output
by printf.
Without that, it's impossible to check what quoting you used, or to
reproduce the issue.


Re: Incrementing variable=0 with arithmetic expansion causes Return code = 1

2020-08-28 Thread Ilkka Virta
On Fri, Aug 28, 2020 at 4:04 PM Gabriel Winkler 
wrote:

> # Causes error
> test=0
> ((test++))
> echo $?
> 1
>

It's not an error, just a falsy exit code. An error would probably give a
message.
But to elaborate on the earlier answers, the value of the post-increment
expression
var++ is the _old_ value of var, even though var itself is incremented as a
side effect.
Use the pre-increment ++var to get the incremented value as the value of
the expression.

The exit status of (( )) is one if the arithmetic expression evaluates to
zero, which is exactly
what happens here.

Similarly, a=0; b=$((a++)) results in a=1, b=0.

On the other hand, a=0; b=$((++a)) results in a=1, b=1, and so does a=0;
b=$((a+=1)).


Re: $(<...) fails with nested quoting.

2020-08-17 Thread Ilkka Virta
On Mon, Aug 17, 2020 at 3:53 PM Steven McBride  wrote:

>  'echo "$(<\"filename\")"' fails with No such file or directory
>

Quotes inside $() are independent from the ones outside. If you escape
them, you get literal quotes as part of the filename.

$ echo hello > '"filename"'
$ echo "$(<\"filename\")"
hello

$ echo another > 'name with spaces'
$ echo "$(<"name with spaces")"
another

I've been led to understand this isn't so with backticks, though.


Re: Expand first before asking the question "Display all xxx possibilities?"

2020-08-06 Thread Ilkka Virta

On 6.8. 15:59, Chet Ramey wrote:

On 8/6/20 8:13 AM, Ilkka Virta wrote:

I think they meant the case where all the files matching the given
beginning have a longer prefix in common. The shell expands that prefix to
the command line after asking to show all possibilities.


Only if you set the "show-all-if-ambiguous" readline variable explicitly
asking for this behavior. Readline's default behavior is to complete up to
the longest common prefix, then, on the next completion attempt, to note
that there weren't any additional changes to the buffer and ask if the user
wants to see the alternatives. Dan wants a change in the behavior that
variable enables.


Right, sorry.

I do have it set because otherwise there's a step where tab-completion 
only produces a beep, and doesn't do anything useful. I didn't realize it 
causes partial completion to be skipped too.



--
Ilkka Virta / itvi...@iki.fi



Re: Expand first before asking the question "Display all xxx possibilities?"

2020-08-06 Thread Ilkka Virta

On 5.8. 22:21, Chris Elvidge wrote:

On 05/08/2020 02:55 pm, Chet Ramey wrote:

On 8/2/20 6:55 PM, 積丹尼 Dan Jacobson wrote:

how about doing the expansion first, so entering
$ zz /jidanni_backups/da would then change into

 >> $ zz /jidanni_backups/dan_home_bkp with below it the question
 >> Display all 113 possibilities? (y or n)

What happens if you have:
dan_home-bkp, dan_home_nobkp, dan-home-bkp, dan-nohome-bkp, 
dan_nohome-bkp (etc.) in /jidanni_backups/?

Which do you choose for the first expansion?


I think they meant the case where all the files matching the given 
beginning have a longer prefix in common. The shell expands that prefix 
to the command line after asking to show all possibilities.


 $ rm *
 $ touch dan_home_bkp{1..199}
 $ ls -l da[TAB]
 Display all 199 possibilities? (y or n) [n]
 $ ls -l dan_home_bkp[cursor here]

So the shell has to fill in the common part anyway, and it might as well 
do it first, without asking.


(Which just so happens to be what Zsh does...)


--
Ilkka Virta / itvi...@iki.fi



Re: No word splitting for assignment-like expressions in compound assignment

2020-07-28 Thread Ilkka Virta

On 28.7. 17:22, Chet Ramey wrote:

On 7/23/20 8:11 PM, Alexey Izbyshev wrote:

$ Z='a b'
$ A=(X=$Z)
$ declare -p A
declare -a A=([0]="X=a b")



It's an assignment statement in a context where assignment statements are
accepted (which is what makes it different from `echo X=$Z', for instance),
but the lack of a subscript on the lhs makes it a special case. I'll take a
look at the semantics here.


This is also a bit curious:

 $ b=( [123]={a,b,c}x )
 $ declare -p b
 declare -a b=([0]="[123]=ax" [1]="[123]=bx" [2]="[123]=cx")

It does seem to have a subscript on the LHS, but it didn't work as one.
To be in line with a plain scalar assignment, the braces should probably 
not be expanded here.



--
Ilkka Virta / itvi...@iki.fi



Re: Segfault in Bash

2020-07-14 Thread Ilkka Virta

On 14.7. 16:08, Chet Ramey wrote:

On 7/14/20 6:32 AM, Jeffrey Walton wrote:

./audit-libs.sh: line 17: 22929 Segmentation fault  (core dumped)
$(echo "$file" | grep -E "*.so$")


Bash is reporting that a process exited due to a seg fault, but it is
not necessarily a bash process.


As a suggestion: it might be useful if the error message showed the 
actual command that ran, after expansions. Here it shows the same 
command each time, and if only one of them crashed, you wouldn't 
immediately know which one it was. The un-expanded source line is in any 
case available in the script itself.


The message also seems to be much briefer for an interactive shell or a 
-c script. At least the latter ones might also benefit from the longer 
error message.


--
Ilkka Virta / itvi...@iki.fi



Re: Segfault in Bash

2020-07-14 Thread Ilkka Virta

On 14.7. 13:32, Jeffrey Walton wrote:

Hi Everyone,

I'm working on a script to find all shared objects in a directory. A
filename should match the RE '*.so$'. I thought I would pipe it to
grep:



IFS="" find "$dir" -name '*.so' -print | while read -r file
do
 if ! $(echo "$file" | grep -E "*.so$"); then continue; fi
 echo "library: $file"

done


Are you trying to find the .so files, or run them for some tests? 
Because it looks to me that you're running whatever that command 
substitution outputs, and not all dynamic libraries are made for that.



--
Ilkka Virta / itvi...@iki.fi



Re: foo | tee /dev/stderr | bar # << thanks!

2020-07-06 Thread Ilkka Virta

On 6.7. 14:37, Greg Wooledge wrote:

On Sat, Jul 04, 2020 at 01:42:00PM -0500, bug-b...@trodman.com wrote:

but your soln is simplier.  I assume /dev/stderr is on non linux UNIX
also.


It is *not*.  It is not portable at all.


It works on macOS and I see mentions of it in man pages of the various 
*BSDs and Solaris, so even if not standard, it's not like it's Linux-only.


--
Ilkka Virta / itvi...@iki.fi



Re: [PATCH 5.1] zread: read files in 4k chunks

2020-06-22 Thread Ilkka Virta

On 22.6. 19.35, Chet Ramey wrote:

On 6/22/20 1:53 AM, Jason A. Donenfeld wrote:

Currently a static sized buffer is used for reading files. At the moment
it is extremely small, making parsing of large files extremely slow.
Increase this to 4k for improved performance.


I bumped it up to 1024 initially for testing.


It always struck me as odd that Bash used such a small read of 128 
bytes. Most of the GNU utils I've looked at on Debian use 8192, and a 
simple test program seems to indicate glibc's stdio reads 4096 bytes at 
one read() call.


--
Ilkka Virta / itvi...@iki.fi



Re: Bug on bash shell - $PWD (and consequentely prompt) not updated while renaming current folder.

2020-06-20 Thread Ilkka Virta

On 20.6. 3.51, corr...@goncalo.pt wrote:

When we rename the current working directory, $PWD doesn't get updated
as it would if we just did a simple "cd directory". 
Fix:

Probably: Trigger the current working directory refresh event, like it
is already done with the cd command. Because we can be renaming our own
current working directory, so a simple trigger is needed when mv is
executed and renaming the current working directory. At the same time,


The directory can get renamed by some completely unrelated background 
process, without any action from the shell, so you'd need to recheck it 
every time the prompt is printed, not just when a particular command, or 
any command, is launched. (The name of the directory could even change 
while the shell is waiting for a command line to be input.)


Running  cd .  should reset PWD to show the new name, and if you need 
that often, I suppose you could run it from PROMPT_COMMAND:


/tmp$ PROMPT_COMMAND='cd .'
/tmp$ mkdir old
/tmp$ cd old
/tmp/old$ mv /tmp/old /tmp/new
/tmp/new$ echo $PWD
/tmp/new


--
Ilkka Virta / itvi...@iki.fi



Could we have GLOBIGNORE ignore . and .. in subdirectories, too?

2020-05-28 Thread Ilkka Virta
Let's say I want to glob just the files with names starting with dots, 
but not the ubiquitous dot and dot-dot entries, which are seldom a 
useful result of a glob.


That can be done with something like   ..?* .[!.]*   or   .!(|.)   with 
extglob. Both are still a bit annoying to type, and it would be nice to 
just have  .*  do this directly. GLOBIGNORE seems like it could help, 
but it appears the automatic hiding of . and .. only works with globs 
without a path element:  .*  doesn't generate them, but   ./.*   does.


I could add  GLOBIGNORE=.:..:*/.:*/..  to catch them also in 
subdirectories, but then of course that doesn't work for 
sub-sub-directories, etc.



Would it be possible to extend GLOBIGNORE or some other option to have
globs like  foo/.*   not generate  . and .. as the final part of the 
path regardless of the level they are in?


When given explicitly, without a glob in the final part of the path, 
they should probably still be allowed, even if the word was otherwise a 
glob. (e.g. if something like  foo/*/.  happened to be useful in some case.)



--
Ilkka Virta / itvi...@iki.fi



Re: Not missing, but very hard to see (was Re: Backslash missing in brace expansion)

2019-12-12 Thread Ilkka Virta

On 12.12. 21:43, L A Walsh wrote:

On 2019/12/06 14:14, Chet Ramey wrote:

Seems very hard to print out that backquote though.  Closest I got
was bash converting it to "''":


The backquote is in [6], and the backslash disappears, you just get the 
pair of quotes in [2] because that's how printf %q outputs an empty string.



 read -r -a a< <(printf "%q " {Z..a})
 my -p a
declare -a a=([0]="Z" [1]="\\[" [2]="''" [3]="\\]" [4]="\\^" [5]="_" 
[6]="\\\`" [7]="a")



--
Ilkka Virta / itvi...@iki.fi



Re: Backslash missing in brace expansion

2019-12-06 Thread Ilkka Virta

On 6.12. 21:36, Eric Blake wrote:

On 12/5/19 10:53 PM, Martin Schulte wrote:


(2019-11-11) x86_64 GNU/Linux $ echo ${BASH_VERSINFO[@]}
4 4 12 1 release x86_64-pc-linux-gnu
$ set -x
$ echo {Z..a}
+ echo Z '[' '' ']' '^' _ '`' a
Z [  ] ^ _ ` a

It looks as if the backslash (between [ and ] in ASCII code) is
missing in brace expansion. The same behaviour seems to be found in
bash 5.0.


It's an unquoted backslash, which is removed by quote removal when the
words are expanded. Look at the extra space between `[' and `]'; that's
the null argument resulting from the unquoted backslash.


Yes - sure. But then I'm wondering why the unquoted backtick doesn't
start command substitution:


It may be version dependent:

$ echo ${BASH_VERSINFO[@]}
5 0 7 1 release x86_64-redhat-linux-gnu

$ echo b{Z..a}d
bash: bad substitution: no closing "`" in `d


I get that with 4.4 and 'echo b{Z..a}d' too, the trailing letter seems 
to trigger it.


Which also doesn't seem to make sense, but one might argue that {Z..a} 
doesn't make much sense in the first place. Seriously, is there an 
actual use case for such a range?


It doesn't seem to even generalize from that if you go beyond letters, 
so you can't do stuff like generating all the printable ASCII characters 
with it in Bash.
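As a workaround sketch (a numeric loop rather than a brace expansion), the printable ASCII range can be generated with printf and octal escapes:

```shell
# print characters 32..126 by converting each code point to an octal escape
for ((i = 32; i < 127; i++)); do
  printf "\\$(printf '%03o' "$i")"
done
echo
```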



--
Ilkka Virta / itvi...@iki.fi



Re: Feature Request: Custom delimeter for single quotes

2019-11-01 Thread Ilkka Virta

On 1.11. 06:54, Patrick Blesi wrote:

I'm looking for a hybrid between single quotes and a here doc or here
string.

The main use case is for accepting arbitrary user-specified text. 


Do your users enter the text by directly editing the script?
Would it make more sense to use e.g. 'read' to read the input directly 
from the user?


input=""
nl='
'
echo "Enter text, end with ^D:"
while IFS= read -r line; do
input="$input$line$nl"
done

printf "You entered:\n---\n%s---\n" "$input"


or to just have the text in a separate file (not the script) and read it 
from there?


input=$(< inputfile)


That way, the text appears in a variable, and you don't need to care 
about quotes inside it.



(You could also read from stdin with just  input=$(cat)  instead of the 
while read loop but that feels a bit odd to me for some reason.)
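For what it's worth, a quoted here-document delimiter already gives much of what's being asked for: the body is taken literally like a single-quoted string, but embedded single quotes need no escaping:

```shell
# quoting the delimiter suppresses all expansions in the body
cat <<'MAGIC_WORD'
echo 'this command is specified by the user' $HOME `date`
MAGIC_WORD
```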



I would
like to wrap this text in single quotes so as to prevent any variable
expansion or interpretation of the text of any kind. Additionally, I would
like to allow the users to include single quotes in their text without
requiring that they escape these quotes.

Something akin to the following would alleviate the need to communicate
that users must escape single quotes, but also provide the same literal
string behavior of single quotes.

presuming the arbitrarily substituted text is:

echo 'this command is specified by the user'

Then a syntax for this single quote heredoc behavior could be like:

$ sh -c <<^MAGIC_WORD echo 'this command is specified by the user'
MAGIC_WORD

Everything within the MAGIC_WORD declarations would not have command
substitution, variable expansion, etc, but would be treated as if it were
wrapped in single quotes with the exception that single quotes between the
MAGIC_WORDs need not be escaped.

Pardon my naïveté, does any such feature exist or are there good ways to
accomplish this? If not, is this something that could feasibly be
implemented? Would it be desirable?

Thanks,

Patrick




--
Ilkka Virta / itvi...@iki.fi



Re: shebang-less script execution not resetting some options

2019-10-02 Thread Ilkka Virta

On 2.10. 13:11, L A Walsh wrote:

On 2019/10/01 05:41, Greg Wooledge wrote:

On Tue, Oct 01, 2019 at 04:14:00AM -0700, L A Walsh wrote:
   

On 2019/09/30 14:39, Grisha Levit wrote:
 

A few of the recently-added shopt options aren't getting reset when
running a shebang-less script, this should fix it up:
   
   

Suppose the shebang-less script is being run by an earlier version
of bash.  Won't the new patch radically change the behavior of of
such programs?
 


Bash allows a child of itself (a subshell) to read the commands.
GNU find -exec uses /bin/sh to run it.
zsh and csh both use /bin/sh to run it, I think.
   


 So if a user has 'rbash' in /etc/passwd, they might get a real shell
because various programs ignore what /etc/passwd says?

 Um... I suppose no one cares for one reason or another.


---
  2.9 Shell Commands
  Command Search and Execution

  If the command name does not contain any <slash> characters, the first
  successful step in the following sequence shall occur:

  [a to d: functions, special builtins, stuff like that]

  e. Otherwise, the command shall be searched for using the PATH
 environment variable
 [...]
 b. Otherwise, the shell executes the utility in a separate utility
environment (see Shell Execution Environment) with actions
equivalent to calling the execl() function...

If the execl() function fails due to an error equivalent to the
[ENOEXEC] [...] the shell shall execute a command equivalent to
having a shell invoked with the pathname resulting from the
search as its first operand, [...]

[ 
https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/utilities/V3_chap02.html#tag_18_09_01_01 
]


-

I think that last sentence above is the relevant part. The standard only 
says to "invoke a shell". It doesn't say which shell, probably because 
it only specifies one.


Incidentally, it doesn't really specify the hashbang either. As far as I 
can tell, it only mentions it as one of two ways that implementations 
have "historically" recognized shell scripts. (The above being the other.)


[ 
https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/functions/execl.html 
]
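The ENOEXEC fallback is easy to observe: a shebang-less executable still runs when invoked from bash, which falls back to interpreting the file itself (paths below are a made-up example):

```shell
# create an executable script with no #! line
dir=$(mktemp -d)
printf 'echo ran without a shebang\n' > "$dir/noshebang"
chmod +x "$dir/noshebang"

# execve() fails with ENOEXEC, so the shell runs the file in a subshell
"$dir/noshebang"    # ran without a shebang
```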



As for rbash, does it matter? If you let a user exec() something, 
they'll get that binary, or the interpreter specified in the hashbang if 
it's a script. The kernel doesn't look at /etc/passwd to recognize rbash 
or such either, so if you want to restrict a user from launching 
particular commands, you'll have to do it before exec() is attempted.


That is to say, don't let those users run (other) unrestricted shells, 
or any of the various programs that allow forking off other programs, 
including find, xargs, many editors, etc.



--
Ilkka Virta / itvi...@iki.fi



Re: Documentation Bug Concerning Regular Expressions?

2019-09-23 Thread Ilkka Virta

On 23.9. 19:56, Hults, Josh wrote:

Hello Bash Maintainers,

In the currently posted version of the Bash documentation, there is a section regarding 
Conditional Constructs (3.2.4.2, 
https://www.gnu.org/savannah-checkouts/gnu/bash/manual/bash.html#Conditional-Constructs). 
 Within that section is a portion discussing the [[ ... ]] operator, and within that 
portion is a discussion of the "=~" regex operator.

The example given of a regex pattern is:  [[ $line =~ [[:space:]]*?(a)b ]].  
(This example is referenced twice)


Yes, that exact example has been discussed for a couple of days, 
starting at this message from last Friday:


https://lists.gnu.org/archive/html/bug-bash/2019-09/msg00042.html


--
Ilkka Virta / itvi...@iki.fi



Re: Wrong command option in the manual examples

2019-09-23 Thread Ilkka Virta

On 22.9. 21:15, Chet Ramey wrote:

On 9/20/19 10:24 PM, hk wrote:


Bash Version: 5.0
Patch Level: 0
Release Status: release

Description:
 On the section 3.2.6(GNU Parallel, page 16 in the pdf) of Bash
Reference Manual. The manual uses `find' command to illustrate
possible use cases of `parallel' as examples. but the option `-depth'
does not accept any argument, I think it means `-maxdepth` option
instead.


-depth n
   True if the depth of the file relative to the starting point of
   the traversal is n.

It's not in POSIX, and maybe GNU find doesn't implement it.


That seems to raise a question.

Isn't Bash a GNU project? Would it be prudent to use other GNU tools in 
examples, if standard POSIX features aren't sufficient? I can see that 
FreeBSD find has '-depth n' (as well as the standard '-depth', somewhat 
confusingly) but should the reader of the manual be assumed to know the 
options supported by BSD utilities?


--
Ilkka Virta / itvi...@iki.fi



Re: Incorrect example for `[[` command.

2019-09-21 Thread Ilkka Virta

On 21.9. 21:55, Dmitry Goncharov wrote:

On Sat, Sep 21, 2019 at 12:34:39PM +0300, Ilkka Virta wrote:

[[:space:]]*?(a)b  isn't a well-defined POSIX ERE:

9.4.6 EREs Matching Multiple Characters

The behavior of multiple adjacent duplication symbols ( '+', '*', '?',
and intervals) produces undefined results.

https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap09.html


This is unfortunate.
*? and +? are widely used non-greedy regexes.


In Perl-compatible regexes. Bash uses POSIX extended regular expressions.

And on a GNU system, while *? and +? don't give errors when used in an 
ERE, they still don't make the repetition non-greedy. They just act the 
same as a single * (as far as I can tell anyway).


 bash$ re='<.+?>'
 bash$ [[ "a<c>e" =~ $re ]] && echo $BASH_REMATCH
 <c>
 bash$ [[ "a<>e" =~ $re ]] && echo $BASH_REMATCH
 <>

--
Ilkka Virta / itvi...@iki.fi



Re: Incorrect example for `[[` command.

2019-09-21 Thread Ilkka Virta

On 21.9. 03:12, hk wrote:

Thanks for the reply. I was wrong in my report. It does match values like
'aab' and ' aab' in its original form.


In some systems, yes. (It does that on my Debian, but doesn't work at 
all on my Mac.)


It is syntatically correct as a regular expression. 


[[:space:]]*?(a)b  isn't a well-defined POSIX ERE:

  9.4.6 EREs Matching Multiple Characters

  The behavior of multiple adjacent duplication symbols ( '+', '*', '?',
  and intervals) produces undefined results.

https://pubs.opengroup.org/onlinepubs/9699919799.2018edition/basedefs/V1_chap09.html


--
Ilkka Virta / itvi...@iki.fi



Re: Incorrect example for `[[` command.

2019-09-20 Thread Ilkka Virta

On 20.9. 21:39, Chet Ramey wrote:


The portion of the manual before the example explains BASH_REMATCH and
BASH_REMATCH[0]. It also says "a sequence of characters in the value..."
when describing the pattern. 


Yeah, though the preceding paragraph contains both the general 
description of the regex match, and the mention of BASH_REMATCH, so the 
BASH_REMATCH angle could be a bit more explicit.


So I'd probably say that the pattern would match e.g. 'xxx aabyyy', or 
'xxxbyyy' and set $BASH_REMATCH to ' aab', or 'b', respectively. And 
then mention that the ^ and $ anchors could be used.


I know the usual regex behavior is to find a match anywhere within the 
value, but since it's exactly the opposite of how regular pattern 
matches work, it's probably worth mentioning in some way.  (Though I do 
think it's better to document things rather explicitly in general.)
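A sketch of the anchoring difference, using a well-defined ERE instead of the manual's `*?` one:

```shell
str='xxx aabyyy'

# unanchored: the regex may match anywhere inside the value
re=' *a+b'
[[ $str =~ $re ]] && printf '<%s>\n' "${BASH_REMATCH[0]}"   # < aab>

# anchored: now the whole value has to match, and it doesn't
re='^ *a+b$'
[[ $str =~ $re ]] || echo 'no match when anchored'
```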



--
Ilkka Virta / itvi...@iki.fi



Re: Incorrect example for `[[` command.

2019-09-20 Thread Ilkka Virta

On 20.9. 15:48, Greg Wooledge wrote:

but after the regex-glob-thing, it says:

   That means values like ‘aab’ and ‘ aab’ will match

So there's a shift in intent between a? and a+ in what's supposed to be
a regular expression.  Although of course the sentence is *literally*
true because the regex would be unanchored, and therefore it's sufficient
to match only the 'ab', and the rest of the input doesn't matter.
But that's just confusing, and doesn't belong in this kind of document.


It goes on to say "as will a line containing a 'b' anywhere in its 
value", so the text does recognize the zero-width-matching parts don't 
affect what matches. (I suppose they would affect what goes to 
BASH_REMATCH[0], but the text doesn't mention that.)


I think it would be a better example with the anchored version also 
presented for comparison.


--
Ilkka Virta / itvi...@iki.fi



Re: Have var+func sourced in a subroutine but they don't seem to end up in same scope

2019-07-29 Thread Ilkka Virta

On 29.7. 09:25, L A Walsh wrote:

The library-include function allows me to source a library file
that is in a relative path off of PATH (a feature not in bash,
unfortunately).


[...]


I tried putting exporting the data and the function with export
but it ended up the same.  The variables weren't defined in the
same scope as the function.


Are you sourcing some other script, or running it as a regular program?

Because above, you say 'source', which would indicate running code from 
another file in the same shell, but then you talk about exporting, which 
really only matters when starting a new process (as far as I know).



An example that would actually run and demonstrate the issue might make 
it easier to see what's actually going on.
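For comparison, here is a minimal runnable sketch (library file name made up) of how 'source' puts both variables and functions into the calling shell's scope, with no export involved:

```shell
# write a small library file
lib=$(mktemp)
cat > "$lib" <<'EOF'
mydata='hello from lib'
myfunc() { echo "myfunc sees: $mydata"; }
EOF

# sourcing runs it in the current shell: both names are defined here now
. "$lib"
myfunc            # myfunc sees: hello from lib
```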



--
Ilkka Virta / itvi...@iki.fi



Re: expression evaluation problem

2019-07-25 Thread Ilkka Virta

On 24.7. 21:43, L A Walsh wrote:

 Does it improve execution time?  That's more of a concern here than
readability, since it is an expression fragment, it isn't meant to be
understood
in isolation.

[...]

 The important part for me is whether or not it is faster to perform
1 calculation, or 100.  So which would be faster?  In this case
execution speed
is more important than clarity.  I consider that a 'constraint'.


Shouldn't it be easy enough to measure how much restructuring that 
expression affects the execution speed?


Also, regarding execution speed, I've been led to believe that the shell 
in general is rather slow, and almost any other language would be better 
if that is of concern. (awk, Perl, what have you.)



--
Ilkka Virta / itvi...@iki.fi



Re: Combination of "eval set -- ..." and $() command substitution is slow

2019-07-16 Thread Ilkka Virta

On 15.7. 20:49, Robert Elz wrote:


printf '%s\n' "`printf %s "$i"`"
printf '%s\n' "$(printf %s "$i")"

aren't actually the same.   In the first $i is unquoted, in the second it is
quoted.   


Huh, really? It looks to me like the first one treats $i as quoted too:

 $ touch file.txt; i='123 *'
 $ printf '%s\n' "`printf :%s: "$i"`"
 :123 *:

But not here, of course:

 $ printf '%s\n' "`printf :%s: $i`"
 :123::file.txt:

I tried with Bash and some other shells, but couldn't find one where the 
result was different. Did I miss something?



--
Ilkka Virta / itvi...@iki.fi



Re: Arithmetic evaluation of negative numbers with base prefix

2019-06-18 Thread Ilkka Virta

On 18.6. 18:20, Greg Wooledge wrote:

On Tue, Jun 18, 2019 at 10:27:48AM -0400, Chet Ramey wrote:

$ ksh93 -c 'echo ${.sh.version}'
Version ABIJM 93v- 2014-09-29
$ ksh93 -c 'echo $(( 10# ))'
ksh93:  10# : arithmetic syntax error


I guess most Linux distributions are not shipping the 2014 version of
ksh93 yet...?


Yeah, I had the one from Debian. I'm not even sure what the current 
version of ksh is.


At least the newer versions throw an error instead of silently doing the 
unexpected.



wooledg:~$ ksh -c 'echo $(( 10# ))'
0
wooledg:~$ dpkg -l ksh | tail -1
ii  ksh  93u+20120801-3.4  amd64  Real, AT&T version of the Korn shell
wooledg:~$ ksh -c 'echo ${.sh.version}'
Version AJM 93u+ 2012-08-01

Seems kinda weird to continue calling it "ksh93" if it's being changed,
but I don't make the decisions.




--
Ilkka Virta / itvi...@iki.fi



Re: Arithmetic evaluation of negative numbers with base prefix

2019-06-17 Thread Ilkka Virta

On 17.6. 18:47, Greg Wooledge wrote:

On Mon, Jun 17, 2019 at 02:30:27PM +0100, Jeremy Townshend wrote:

In the meantime it would seem cautionary to advise against the pitfall of
using base# prefixed to variables (contrary to
mywiki.wooledge.org/ArithmeticExpression) unless you can be confident that
they will never be decremented below zero.


Fair point.  I've updated <https://mywiki.wooledge.org/ArithmeticExpression>
and <https://mywiki.wooledge.org/BashPitfalls>.


Good!

I still wish this could be fixed to do the useful thing without any 
workarounds, given it's what ksh and zsh do, and since this is the 
second time it comes up on the list, it appears to be surprising to 
users, too.


The base# prefix is already an extension of the C numeric constant 
syntax, so extending it further to include an optional sign wouldn't 
seem inappropriate.



I took a look last night and made some sort of a patch. It seems to 
work, though I'm not sure if I've missed any corner cases. Apart from 
the digitless '10#', the behaviour matches ksh and zsh, I made it an 
error, they apparently allow it.


  $ cat test.sh
  echo $(( 10 * 10#-123 ))  # -1230
  echo $(( 10 * 10#-008 ))  #   -80
  echo $(( 10 * 10#1+23 ))  #10*1 + 23 = 33
  echo $(( 10# ))   #  error

  $ ./bash test.sh
  -1230
  -80
  33
  test.sh: line 5: 10#: no digits in number (error token is "10#")

  $ ksh test.sh
  -1230
  -80
  33
  0


--
Ilkka Virta / itvi...@iki.fi
--- expr.c.orig 2018-12-17 16:32:21.0 +0200
+++ expr.c  2019-06-18 08:30:31.110851666 +0300
@@ -1386,8 +1386,12 @@ readtok ()
 }
   else if (DIGIT(c))
 {
-  while (ISALNUM (c) || c == '#' || c == '@' || c == '_')
-   c = *cp++;
+  unsigned char prevc = c;
+  while (ISALNUM (c) || c == '#' || c == '@' || c == '_' || ((c == '+' || c == '-') && prevc == '#'))
+{
+  prevc = c;
+  c = *cp++;
+}
 
   c = *--cp;
   *cp = '\0';
@@ -1531,6 +1535,8 @@ strlong (num)
   register char *s;
   register unsigned char c;
   int base, foundbase;
+  int digits = 0;
+  char sign = 0;
   intmax_t val;
 
   s = num;
@@ -1569,8 +1575,16 @@ strlong (num)
 
  base = val;
  val = 0;
+ digits = 0;
  foundbase++;
}
+  else if (c == '-' || c == '+')
+{
+  if (digits > 0 || sign != 0)
+evalerror (_("invalid number"));
+  
+  sign = c;
+}
   else if (ISALNUM(c) || (c == '_') || (c == '@'))
{
  if (DIGIT(c))
@@ -1588,11 +1602,18 @@ strlong (num)
evalerror (_("value too great for base"));
 
  val = (val * base) + c;
+ digits++;
}
   else
break;
 }
 
+  if (sign == '-')
+val *= -1;
+  
+  if (digits == 0)
+evalerror (_("no digits in number"));
+
   return (val);
 }
 


Re: Arithmetic evaluation of negative numbers with base prefix

2019-06-14 Thread Ilkka Virta

On 14.6. 17:19, Jeremy Townshend wrote:

echo $((10#-1))   # -1 as expected


Earlier discussion about the same on bug-bash:
https://lists.gnu.org/archive/html/bug-bash/2018-07/msg00015.html

Bash doesn't support the minus (or plus) sign following the 10#.
I think the expression above seems to work in this case because 10# is 
treated as a constant number by itself (with a value of 0), and then the 
1 is subtracted.


try also e.g.:

  $ echo $((10#))
  0


echo $((0-10#-1)) # -1 UNEXPECTED. Would expect 1.


So this is 0-0-1 = -1

--
Ilkka Virta / itvi...@iki.fi



Re: Code Execution in Mathematical Context

2019-06-06 Thread Ilkka Virta

On 6.6. 15:53, Greg Wooledge wrote:

wooledg:~$ echo $(( a[$i] ))
Tue 04 Jun 2019 09:23:28 AM EDT
0



wooledg:~$ echo $(( 'a[$i]' ))
bash: 'a[$(date >&2)]' : syntax error: operand expected (error token is "'a[$(date >&2)]' ")


I definitely got different results when I added single quotes.


Well, yes... The point I was trying to make before (and which Chet 
seemed to confirm) is that the quotes break the $((..)) expression 
completely, even if 'i' there is just a number, and not any fancier than 
that.


  $ a=(123 456 789)
  $ i=2
  $ echo $(( a[$i] ))
  789
  $ echo $(( 'a[$i]' ))
  bash5.0.3: 'a[2]' : syntax error: operand expected (error token is 
"'a[2]' ")

  $ echo $(( 'a[2]' ))
  bash5.0.3: 'a[2]' : syntax error: operand expected (error token is 
"'a[2]' ")


And with ((..)), the command substitution runs with a slightly different 
i, quotes or not and with $i or i:


  $ i='a[$(date >&2)]'
  $ (( a[$i]++ ))
  Thu Jun  6 16:46:06 EEST 2019
  bash5.0.3: a[]: bad array subscript
  bash5.0.3: a[]: bad array subscript
  $ (( a[i]++ ))
  Thu Jun  6 16:46:12 EEST 2019
  $ (( 'a[$i]++' ))
  Thu Jun  6 16:46:18 EEST 2019
  $ (( 'a[i]++' ))
  Thu Jun  6 16:46:31 EEST 2019


So if we want "valid" values in the index to work, and invalid ones to 
not do anything nasty, I can't seem to find a case where quotes would 
help with that.


--
Ilkka Virta / itvi...@iki.fi



Re: Code Execution in Mathematical Context

2019-06-05 Thread Ilkka Virta

On 5.6. 17:05, Chet Ramey wrote:

On 6/4/19 3:26 PM, Ilkka Virta wrote:

If the bad user supplied variable contains array indexing in itself, e.g.
bad='none[$(date >&2)]' then using it in an arithmetic expansion still
executes the 'date', single quotes or not (the array doesn't need to exist):


Because the value is treated as an expression, not an integer constant.


And I suppose that's by design, or just required by the arithmetic 
expression syntax, right? I think that was part of the original question.



   $ (( 'bad' ))
   Tue Jun  4 22:04:32 EEST 2019


Quoting a string doesn't make it a non-identifier in this context.


So is there some other "simple" way of preventing that, then?


   $ echo "$(( 'a[2]' ))"
   bash: 'a[2]' : syntax error: operand expected (error token is "'a[2]' ")


The expression between the parens is treated as if it were within double
quotes, where single quotes are not special.


I did put the double-quotes around the $((...)), but the same happens 
even without them. Is this just a difference between ((...)) and 
$((...)) for some reason?


--
Ilkka Virta / itvi...@iki.fi



Re: Code Execution in Mathematical Context

2019-06-04 Thread Ilkka Virta

On 4.6. 16:24, Greg Wooledge wrote:

On Tue, Jun 04, 2019 at 01:42:40PM +0200, Nils Emmerich wrote:

Bash Version: 5.0
Patch Level: 0
Release Status: release

Description:
         It is possible to get code execution via a user supplied variable in
the mathematical context.



For example:  (( 'a[i]++' ))   or   let 'a[i]++'



Without quotes in the former, something bad happens, but I can't remember
the details off the top of my head.


If the bad user supplied variable contains array indexing in itself, 
e.g. bad='none[$(date >&2)]' then using it in an arithmetic expansion 
still executes the 'date', single quotes or not (the array doesn't need 
to exist):


  $ a=(123 456 789) bad='none[$(date >&2)]'
  $ unset none
  $ (( a[bad]++ ))
  Tue Jun  4 22:00:38 EEST 2019
  $ (( 'a[bad]++' ))
  Tue Jun  4 22:00:42 EEST 2019

Same here, of course:

  $ (( bad ))
  Tue Jun  4 22:04:29 EEST 2019
  $ (( 'bad' ))
  Tue Jun  4 22:04:32 EEST 2019

So, it doesn't seem the single-quotes help. They do seem to break the 
whole expression within "$(( ))", though:


  $ echo "$(( 'a[2]' ))"
  bash: 'a[2]' : syntax error: operand expected (error token is "'a[2]' ")
  $ i=2
  $ echo "$(( 'a[i]' ))"
  bash: 'a[i]' : syntax error: operand expected (error token is "'a[i]' ")
  $ echo "$(( 'a[$i]' ))"
  bash: 'a[2]' : syntax error: operand expected (error token is "'a[2]' ")


Maybe it would be better to try to sanity-check any user-provided values 
first:


  $ case $var in *[^0123456789]*) echo "Invalid input" >&2; exit 1;; esac
  $ (( a[var]++ ))  # safe now?
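Wrapping that check in a small helper (just a sketch; the function name is made up):

```shell
# accept only non-empty strings of ASCII digits
is_uint() {
  case $1 in
    ''|*[!0123456789]*) return 1 ;;
  esac
}

bad='none[$(date >&2)]'
if is_uint "$bad"; then
  (( a[bad]++ ))
else
  echo 'rejected unsafe index' >&2
fi
```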


--
Ilkka Virta / itvi...@iki.fi



Re: Arithmetic expansion with increments and output redirection

2019-04-24 Thread Ilkka Virta

On 24.4. 16:37, Chet Ramey wrote:

"Utilities other than the special built-ins (see Special Built-In
Utilities) shall be invoked in a separate environment that consists of the
following...[includes redirections specified to the utility]...


It does say

"Open files inherited on invocation of the shell, open files controlled 
by the exec special built-in plus any modifications, and additions 
specified by any redirections to the utility"


which could also be read to apply only the open files themselves, not 
the byproducts of finding out their names.



The
environment of the shell process shall not be changed by the utility"


It's not the utility that changes the environment when processing the 
expansion, but the shell itself, isn't it?


- -

Anyway, as little as it's worth, Zsh seems to do it the same way Bash 
does, all others leave the changed value visible.


 $ for shell in 'busybox sh' dash yash ksh93 mksh bash zsh; do $shell -c
   'i=1; /bin/echo foo > $(( i += 1 )); printf "%-15s %s\n" "$1:" "$i";'
   sh "$shell"; done
 busybox sh:     2
 dash:           2
 yash:           2
 ksh93:          2
 mksh:           2
 bash:           1
 zsh:            1


I also did find the Bash/Zsh behaviour a bit surprising. But I'm not 
sure it matters other than here and with stuff like $BASHPID? It's easy 
to work around here by splitting the increment/decrement to a separate line:


 /bin/echo foo > "$i"
 : "$(( i += 1 ))"

Some find that easier to read, too: the increment isn't "hidden" within 
the other stuff on the command line.



--
Ilkka Virta / itvi...@iki.fi



Re: bug: illegal function name?

2019-01-20 Thread Ilkka Virta

In POSIX mode, Bash disallows names like '1a':

 13. Function names must be valid shell names. That is, they may not
 contain characters other than letters, digits, and underscores, and may
 not start with a digit. Declaring a function with an invalid name
 causes a fatal syntax error in non-interactive shells.

 14. Function names may not be the same as one of the POSIX special
 builtins.

https://www.gnu.org/software/bash/manual/html_node/Bash-POSIX-Mode.html#Bash-POSIX-Mode

The rules are more lax if POSIX mode is not set, but there's nothing 
that requires using nonstandard function names even in that case.



The manual could of course mention something about the accepted function 
names, e.g.  "Function names can contain the characters [...], except in 
POSIX mode, where they must be valid shell /names/."  I'm not exactly 
sure what the accepted characters are, though, so I can't really suggest 
anything concrete.
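A quick demonstration of the default (non-POSIX) behaviour, where such a name is accepted but then needs 'unset -f':

```shell
# accepted in default mode; POSIX mode rejects the definition
1a() { echo 'running 1a'; }
1a              # running 1a

# plain 'unset 1a' treats 1a as a (bad) variable name; -f targets functions
unset -f 1a
```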



On 20.1. 17:26, Andrey Butirsky wrote:

Andreas, I know it will work with the '-f' flag.
But for others function names, the '-f' unset flag is not required.
Moreover, it seem confronts with Open Group Base Specification.
So I consider it as a bug still.

On 20.01.2019 18:18, Andreas Schwab wrote:

On Jan 20 2019, Andrey Butirsky  wrote:


$ unset 1a
bash: unset: `1a': not a valid identifier

Use `unset -f'.

Andreas.







--
Ilkka Virta / itvi...@iki.fi



Re: Difference of extglob between 5.0.0(1)-release and 4.4.23(1)-release

2019-01-13 Thread Ilkka Virta

On 13.1. 14:37, Andreas Schwab wrote:

On Jan 13 2019, Robert Elz  wrote:


The pattern
./$null"$dir"/

is expanded (parameter expansion) to

./@()./

which does not have a "." immediately after the / and
thus cannot match any filename (including ".") which
starts with a '.' character.


For the same reason `*.' doesn't match `.'.  Making `@()' work differently
from `*' would be surprising.


However,  ?(aa).foo  matches the file  .foo  in Bash 4.4 and 5.0 (and 
also in Ksh and Zsh), so extglob already breaks the above mentioned rule.


  $ touch .foo aa.foo; bash -O extglob -c 'echo ?(aa).foo'
  aa.foo .foo


The change in Bash 5.0 also makes  @(aa|)  different from  ?(aa) , even 
though the distinction between those two doesn't appear immediately 
obvious.



--
Ilkka Virta / itvi...@iki.fi



Re: built-in '[' and '/usr/bin/[' yield different results

2018-11-13 Thread Ilkka Virta

On 13.11. 18:29, Service wrote:

     # Put the above commands into a script, say check.sh
     # Run with: /bin/sh < check.sh
     # Or  : /bin/sh ./check.sh
     # Or  : /usr/bin/env ./check.sh

     # Output is always not ok:
     not_nt
     nt


 $ cat check.sh
 export PATH=""
 /bin/touch file1
 /bin/rm -f file2
 if  [ file1 -nt file2 ]; then echo nt; else echo not_nt; fi
 if /usr/bin/[ file1 -nt file2 ]; then echo nt; else echo not_nt; fi

 $ bash ./check.sh
 nt
 nt

 $ /bin/sh ./check.sh
 not_nt
 nt

Isn't that Windows Linux thingy based on Ubuntu? /bin/sh isn't Bash by 
default on Debian and Ubuntu, so it might be you're just not running the 
script with Bash.



--
Ilkka Virta / itvi...@iki.fi



Re: GNUbash v. 4.4.23-5 – Bash identifier location is non-correct in terminal

2018-10-29 Thread Ilkka Virta

On 29.10. 12:40, Ricky Tigg wrote:

Actual result:

$ curl https://www.startpage.com
(...)  [yk@localhost ~]$


The shell just prints the prompt wherever the cursor was left. That's 
quite common; the only exception I know is zsh, which moves the cursor 
to the start of the line if the previous command didn't leave it at the 
left edge.


A simple workaround would be to add '\n' at the start of the prompt, but 
it would then print an empty line above the prompt for every command 
that does properly finish the output with a newline. Some might find 
that ugly.


It might be possible to check for that manually in PROMPT_COMMAND. 
Something like this seems to mostly work for me in interactive use, 
though it's rather stupid and will probably break down at some point.


  prompt_to_bol() { local pos; printf '\e[6n'; read -sdR pos;
  [[ ${pos#*;} != 1 ]] && printf '\e[30;47m%%\n\e[0m'; }
  PROMPT_COMMAND=prompt_to_bol

(I stole the main parts from the answers in 
https://unix.stackexchange.com/q/88296/170373 )



--
Ilkka Virta / itvi...@iki.fi



Re: Strange (wrong?) behaviour of "test ! -a file"

2018-10-21 Thread Ilkka Virta

On 21.10. 20:03, Chet Ramey wrote:

The help text for test says

"The behavior of test depends on the number of arguments.  Read the
bash manual page for the complete specification."


Can I suggest adding that note from the help text to the manual under 
"Bash Conditional Expressions" too? Perhaps also explicitly noting the 
clash with the binary operators in the descriptions of -a and -o, too.


Something along the lines of the attached patch.

I also think the description of the 3-argument test would be clearer 
with numbering, so I added that too.
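A sketch of the argument-count rules in action:

```shell
# three arguments: '-a' is taken as the binary AND operator, so this
# just tests that '!' and 'nosuchfile' are both non-empty strings
test ! -a nosuchfile && echo 'true, even though the file is missing'

# the negated two-argument form the author probably wanted
test ! -e nosuchfile && echo 'true, file is missing'
```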


--
Ilkka Virta / itvi...@iki.fi
--- bashref.texi.orig   2018-10-21 20:45:31.909941736 +0300
+++ bashref.texi2018-10-21 21:09:24.302551079 +0300
@@ -3795,18 +3795,25 @@
 
 @item 3 arguments
 The following conditions are applied in the order listed.
+
+@enumerate
+@item
 If the second argument is one of the binary conditional
 operators (@pxref{Bash Conditional Expressions}), the
 result of the expression is the result of the binary test using the
 first and third arguments as operands.
 The @samp{-a} and @samp{-o} operators are considered binary operators
 when there are three arguments.
+@item
 If the first argument is @samp{!}, the value is the negation of
 the two-argument test using the second and third arguments.
+@item
 If the first argument is exactly @samp{(} and the third argument is
 exactly @samp{)}, the result is the one-argument test of the second
 argument.
+@item
 Otherwise, the expression is false.
+@end enumerate
 
 @item 4 arguments
 If the first argument is @samp{!}, the result is the negation of
@@ -6821,7 +6828,9 @@
 @cindex expressions, conditional
 
 Conditional expressions are used by the @code{[[} compound command
-and the @code{test} and @code{[} builtin commands.
+and the @code{test} and @code{[} builtin commands. The behavior of
+the @code{test} and @code{[} builtins depend on the number of arguments.
+See their descriptions in @ref{Bourne Shell Builtins} for details.
 
 Expressions may be unary or binary.
 Unary expressions are often used to examine the status of a file.
@@ -6846,7 +6855,8 @@
 
 @table @code
 @item -a @var{file}
-True if @var{file} exists.
+True if @var{file} exists. Note that this may be interpreted as the binary
+@samp{-a} operator if used with the @code{test} or @code{[} builtins.
 
 @item -b @var{file}
 True if @var{file} exists and is a block special file.
@@ -6924,6 +6934,8 @@
 True if the shell option @var{optname} is enabled.
 The list of options appears in the description of the @option{-o}
 option to the @code{set} builtin (@pxref{The Set Builtin}).
+Note that this may be interpreted as the binary @samp{-o} operator
+if used with the @code{test} or @code{[} builtins.
 
 @item -v @var{varname}
 True if the shell variable @var{varname} is set (has been assigned a value).


Re: comment on RFE: 'shift'' [N] ARRAYNAME

2018-09-27 Thread Ilkka Virta

On 27.9. 15:35, Greg Wooledge wrote:

Shift already takes one optional argument: the number of items to shift
from the argv list.  Adding a second optional argument leads to a quagmire.
Do you put the optional list name first, or do you put the optional number
first?  If only one argument is given, is it a list name, or is it a number?

(OK, granted, in bash it is not permitted to create an array whose name
is strictly digits, but still.)


Can you make an array whose name even starts with a digit?

With no overlap between array names and valid numbers,
shift [arrayname] [n]   would be unambiguous, as you said.

Though  shift [n [arrayname]]   would be even more backward-compatible 
since the new behaviour would always require two arguments, which is now 
an error.



Even so, deciding how to handle sparse arrays might be an interesting 
issue, too.


If one wants a command that looks like the current shift, Dennis's 
obvious slice-assignment could be wrapped in a function. Doing it this 
way of course collapses the indices to consecutive numbers starting at zero.


 ashift() {
     typeset -n _arr_="$1";            # nameref (bash 4.3+) to the caller's array
     _arr_=("${_arr_[@]:${2-1}}");     # drop the first $2 (default 1) elements
 }
 somearray=(a b c d)
 ashift somearray 2


--
Ilkka Virta / itvi...@iki.fi



Re: bash sockets: printf \x0a does TCP fragmentation

2018-09-22 Thread Ilkka Virta

On 22.9. 02:34, Chet Ramey wrote:

Newline? It's probably that stdout is line-buffered and the newline causes
a flush, which results in a write(2).


Mostly out of curiosity, what kind of buffering logic does Bash (or the 
builtin printf in particular) use? It doesn't seem to be the usual stdio 
logic where you get line-buffering if printing to a terminal and block 
buffering otherwise. I get a distinct write per line even if the stdout 
of Bash itself is redirected to say /dev/null or a pipe:


 $ strace -etrace=write bash -c 'printf "foo\nbar\n"' > /dev/null
 write(1, "foo\n", 4)= 4
 write(1, "bar\n", 4)    = 4
 +++ exited with 0 +++


--
Ilkka Virta / itvi...@iki.fi



Re: bash sockets: printf \x0a does TCP fragmentation

2018-09-22 Thread Ilkka Virta

On 22.9. 12:50, dirk+b...@testssl.sh wrote:

cat has a problem with binary chars, right? And: see below.


No, it just loops with read() and write(), it shouldn't touch any of the 
bytes (except for cat -A and such). But it probably doesn't help in 
coalescing the write blocks, it's likely to just write() whatever it 
gets immediately.


And you can't really solve the issue at hand by piping to any 
intermediate program, as that program couldn't know how long to buffer 
the input. Unless you use something that buffers for a particular amount 
of time, which of course causes an unnecessary delay.


The coreutils printf seems to output 'foo\nbar\n' as a single write, 
though (unless it goes to the terminal, where the usual stdio 
line-buffering applies), so you might be able to use that.



In any case, if a TCP endpoint cares about getting full data packets 
within a single segment, I'd say it's broken.


--
Ilkka Virta / itvi...@iki.fi



Re: Add sleep builtin

2018-08-22 Thread Ilkka Virta

On 22.8. 15:22, Greg Wooledge wrote:

Just for the record, the POSIX sleep command only accepts an "integral
number of seconds specified by the time operand."  Sub-second sleep(1)
is a GNUism.


Or ksh-ism? (Or does it even matter which one it is originally, since 
Bash is GNU Bash with features common with ksh.)


Regardless, subsecond sleeping is useful and Bash already supports 
handling subsecond times in at least 'read -t', so it's not that 
far-fetched to implement it in 'sleep' too.
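For what it's worth, 'read -t' can already be pressed into service as a
subsecond sleep. A sketch of the usual workaround (the function name
'snooze' is made up here, and the redirection trick relies on
Linux-style /dev/fd reopen semantics, so treat it as illustrative
rather than portable):

```shell
# read -t accepts fractional timeouts; opening the process-substitution
# fd read-write keeps a writer on the pipe, so read never sees EOF and
# simply waits out the timeout instead of returning immediately.
snooze() {
    read -rt "$1" <> <(:) || :    # read times out; '|| :' hides the status
}

snooze 0.2    # pauses for roughly 0.2 seconds
```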



I'd still suggest supporting the features in GNU coreutils sleep, if an 
enabled-by-default builtin sleep is implemented. Just for those systems 
that happen to have both GNU Bash and GNU coreutils installed.


--
Ilkka Virta / itvi...@iki.fi



Re: Add sleep builtin

2018-08-21 Thread Ilkka Virta

On 21.8. 14:34, konsolebox wrote:

Also it's basically
people's fault for not reading documentation.  One should be aware
enough if they enable the builtin.


Yes, if it's not enabled by default.


--
Ilkka Virta / itvi...@iki.fi



Re: Add sleep builtin

2018-08-21 Thread Ilkka Virta

On 21.8. 02:35, Chet Ramey wrote:

I don't think there's a problem with a `syntax conflict' as long as any
builtin sleep accepts a superset of the POSIX options for sleep(1).


The sleep in GNU coreutils accepts suffixes indicating minutes, hours
and days (e.g.  sleep 1.5m  or  sleep 1m 30s  for 90 seconds). I didn't
see support for those in konsolebox's patch, so while that's not
conflicting syntax per se, the lack of that option might trip someone.
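Translating those suffixes down to plain seconds is simple enough that
a small wrapper could paper over the difference; a hypothetical sketch
(the name 'to_seconds' is mine, and awk only handles the fractional
arithmetic):

```shell
# Convert GNU-coreutils-style sleep operands (45, 1.5m, "1m 30s", 2h, 1d)
# into a single number of seconds.
to_seconds() {
    local total=0 arg num unit mult
    for arg; do
        num=${arg%[smhd]}        # value with any trailing unit letter removed
        unit=${arg#"$num"}       # the letter we removed: "", s, m, h or d
        case $unit in
            ''|s) mult=1 ;;
            m)    mult=60 ;;
            h)    mult=3600 ;;
            d)    mult=86400 ;;
        esac
        # awk copes with fractional values like 1.5
        total=$(awk -v t="$total" -v n="$num" -v m="$mult" \
                    'BEGIN { print t + n * m }')
    done
    printf '%s\n' "$total"
}

to_seconds 1.5m      # prints 90
to_seconds 1m 30s    # prints 90
```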


--
Ilkka Virta / itvi...@iki.fi




Re: Rational Range Interpretation for bash-5.0?

2018-08-15 Thread Ilkka Virta

On 6.8. 23:07, Chet Ramey wrote:

Hi. I am considering making bash glob expansion implement rational range
interpretation starting with bash-5.0 -- basically making globasciiranges
the default. It looks like glibc is going to do this for version 2.28 (at
least for a-z, A-Z, and 0-9), and other GNU utilities have done it for some
time. What do folks think?


I tried to think about a counterpoint, some case for where the current
(non-globasciiranges) behaviour would be useful, but I can't come up
with any. At least the part where [a-z] matches A, but not Z makes it a
bit useless.

If you're considering special-casing just those three, I'd suggest 
adding a-f and A-F too, for patterns matching hex digits.
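A quick demonstration of what the change means in practice (assuming a
bash new enough, 4.3 or later, to have the globasciiranges option at
all):

```shell
# With globasciiranges on, [a-z] means exactly the ASCII lowercase
# letters, regardless of the locale's collation order.
shopt -s globasciiranges
[[ A == [a-z] ]] && echo 'A matches' || echo 'A does not match'
[[ b == [a-z] ]] && echo 'b matches' || echo 'b does not match'

# ...and a hex-digit pattern built from ranges behaves predictably:
[[ 1f == [0-9a-fA-F][0-9a-fA-F] ]] && echo '1f looks like hex'
```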


So yeah, +1 from me.

--
Ilkka Virta / itvi...@iki.fi



Re: Assignment of $* to a var removes spaces on unset IFS.

2018-08-15 Thread Ilkka Virta

On 15.8. 15:44, Greg Wooledge wrote:

glob() {
 # "Return" (write to stdout, one per line) the expansions of all
 # arguments as globs against the current working directory.
 printf %s\\n $*
}

But... but... but... the PREVIOUS glob worked!  Why didn't this one
work?


I'm sure you know what word splitting is.


I'll leave the non-broken implementation as an exercise for the reader.


$ glob() { local IFS=; printf '%s\n' $*; }
$ touch "foo bar.txt" "foo bar.pdf"
$ glob "foo bar*"
foo bar.pdf
foo bar.txt

(Well, you'd probably want 'nullglob' too, and there's the minor issue
that printf '%s\n'  prints at least one line even if there are no
arguments but I'll ignore that for now.)


Of course, in most cases, an unquoted expansion is not what one wants,
but if there's need to glob in the shell, then an unquoted expansion is
what has to be used. How IFS affects word splitting isn't just about
$* , the issue is the same even if you only have one glob in a regular
variable.

--
Ilkka Virta / itvi...@iki.fi



Re: Tilde expansion in assignment-like context

2018-08-06 Thread Ilkka Virta

On 6.8. 22:45, Chet Ramey wrote:

Yes. Bash has done this since its earliest days. A word that looks like an
assignment statement has tilde expansion performed after unquoted =~ and :~
no matter where it appears on the command line. 


Given that options starting with double dashes (--something=/some/dir) 
are rather common, would it make sense to extend tilde expansion to 
apply in that case too?


Of course, getopt_long() supports giving the option argument in a 
separate command-line argument, so you can work around it with that.



Also, does the documentation actually say tilde expansion applies in 
anything that looks like an assignment? I can only see "If a word begins 
with an unquoted tilde character..." and "Each variable assignment is 
checked for unquoted tilde-prefixes...", but from the shell language 
point of view, the one in 'make DESTDIR=~stager/bash-install' isn't an 
assignment, just a regular command line argument.


The paragraph about assignments could be expanded to say "This applies 
also to regular command-line arguments that look like assignments." or 
something like that.



--
Ilkka Virta / itvi...@iki.fi



Re: Unquoted array slice ${a[@]:0} expands to just one word if IFS doesn't have a space

2018-08-01 Thread Ilkka Virta

On 1.8. 15:12, Greg Wooledge wrote:

On Wed, Aug 01, 2018 at 02:43:27PM +0300, Ilkka Virta wrote:

On both Bash 4.4.12(1)-release and 5.0.0(1)-alpha, a subarray slice like
${a[@]:0} expands to just one word if unquoted (and if IFS doesn't
contain a space):


This just reinforces the point that unquoted $@ or $* (or the array
equivalent) is a bug in the script.  It gives unpredictable results.


Unquoted $* seems well-defined in Bash's reference manual:

  ($*) Expands to the positional parameters, starting from one. When the
  expansion is not within double quotes, each positional parameter
  expands to a separate word.

The reference doesn't really say anything about an unquoted $@, but then 
there's the POSIX definition which should be well-defined in this case, 
since clearly field-splitting should be performed here.


  @: Expands to the positional parameters, starting from one, initially
  producing one field for each positional parameter that is set. When
  the expansion occurs in a context where field splitting will be
  performed, any empty fields may be discarded and each of the non-empty
  fields shall be further split as described in Field Splitting.


Now, of course POSIX doesn't say anything about arrays or the 
subarray/slice notation, but then Bash's reference mentions that
[@] and [*] are supposed to be analogous to $@ and $*, and the 
description of ${parameter:offset:length} doesn't say that 
${array[@]:n:m} would act differently from ${array[@]} let alone 
differently from ${@:n:m}.


Instead, the wording of the subarray/slice expansion is similar for both 
${@:n:m} and ${array[@]:n:m}:


  ${parameter:offset:length}

  If parameter is ‘@’, the result is length positional parameters
  beginning at offset.

  If parameter is an indexed array name subscripted by ‘@’ or ‘*’, the
  result is the length members of the array beginning with
  ${parameter[offset]}.


It doesn't say what's done with those parameters or array members, but 
if the behaviour is supposed to be different between these two cases, 
it's not documented.



--
Ilkka Virta / itvi...@iki.fi




Unquoted array slice ${a[@]:0} expands to just one word if IFS doesn't have a space

2018-08-01 Thread Ilkka Virta

On both Bash 4.4.12(1)-release and 5.0.0(1)-alpha, a subarray slice like
${a[@]:0} expands to just one word if unquoted (and if IFS doesn't
contain a space):

$ a=(aa bb); IFS=x; printf ":%s:\n" ${a[@]:0}
:aa bb:


I expected it would expand to separate words, as it does without the 
slice, and just like $@ does, sliced or not:


$ a=(aa bb); IFS=x; printf ":%s:\n" ${a[@]}
:aa:
:bb:
$ set -- aa bb; IFS=x; printf ":%s:\n" $@
:aa:
:bb:
$ set -- aa bb; IFS=x; printf ":%s:\n" ${@:1}
:aa:
:bb:


It's as if it first joins the picked elements with spaces, and then 
splits using IFS, instead of producing multiple words and word-splitting 
them individually.


The same thing happens with ${a[*]:0} (but not with ${*:1}):
the array elements get joined with spaces to a single word. If IFS is 
empty, unset, or contains a space the result is multiple words as 
expected with both [@] and [*].



An expansion like that should in most cases be quoted,
but the current behaviour still seems a bit inconsistent.


--
Ilkka Virta / itvi...@iki.fi



Re: Empty ""s in ARG in ${x:+ARG} expand to no words instead of the empty word if prepended/appended with space

2018-07-21 Thread Ilkka Virta

On 21.7. 07:44, Bob Proulx wrote:

Denys Vlasenko wrote:

$ f() { for i; do echo "|$i|"; done; }
$ x=x
$ e=
$ f ${x:+ ""}
^^^ prints nothing, bug?

$ f ${x:+"" }
^^^ prints nothing, bug?


Insufficient quoting.  That argument should be quoted to avoid the
whitespace getting stripped.  (Is that during word splitting phase
using the IFS?  I think so.)

Try this:

   f "${x:+ ""}"
   f "${x:+"" }"


That's not the same at all. With outer quotes, the result will always be 
a single word. Without them, having an empty 'x' would result in no word:



Without outer quotes:

$ for cond in "" "1" ; do for value in "" "*" ; do printf "<%s>\t" 
"$cond" "$value" ${cond:+"$value"}; echo; done; done

<>  <>
<>  <*>
<1> <>  <>
<1> <*> <*>


With outer quotes:

$ for cond in "" "1" ; do for value in "" "*" ; do printf "<%s>\t" 
"$cond" "$value" "${cond:+"$value"}"; echo; done; done

<>  <>  <>
<>  <*> <>
<1> <>  <>
<1> <*> <*>


I suppose that could be used to pass optional arguments to some command.

Though different shells do behave a bit differently here, and I'm not 
sure which behaviour is the correct one. With the values from the third 
line in the above test (the other three seem consistent), different shells:



No extra spaces in ${cond:+"$value"}:

$ for shell in bash dash ksh "zsh -y" ; do $shell -c 'cond=1; value=""; 
printf "<%s> " "$0" ${cond:+"$value"}; echo;' ; done

 <>
 <>

 <>


Extra spaces in ${cond:+ "$value" }:

$ for shell in bash dash ksh "zsh -y" ; do $shell -c 'cond=1; value=""; 
printf "<%s> " "$0" ${cond:+ "$value" }; echo;' ; done


 <>
 <>
 <>


Or with multiple words inside:

$ for shell in bash dash ksh "zsh -y" ; do $shell -c 'cond=1; printf 
"<%s> " "$0" ${cond:+"" "x" ""}; echo;' ; done

 
 <>  <>
 <> 
 <>  <>


It doesn't seem like a very good idea to rely on this, arrays would of 
course work better.
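To spell out the array alternative (a sketch, with made-up values in
the style of the tests above):

```shell
# "${args[@]}" expands to exactly the words stored in it, including no
# words at all when the array is empty, so optional arguments survive
# without any :+ quoting puzzles.
args=()
cond=1
if [[ $cond ]]; then
    args+=("" "x" "")                      # three words, two of them empty
fi
printf '<%s> ' start "${args[@]}"; echo    # <start> <> <x> <>

args=()                                    # cond empty: nothing added
printf '<%s> ' start "${args[@]}"; echo    # just <start>
```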



Bash: GNU bash, version 4.4.12(1)-release (x86_64-pc-linux-gnu)
ksh:  version sh (AT&T Research) 93u+ 2012-08-01
zsh:  zsh 5.3.1 (x86_64-debian-linux-gnu)
dash: Debian's 0.5.8-2.4


--
Ilkka Virta / itvi...@iki.fi



Re: Number with sign is read as octal despite a leading 10#

2018-07-10 Thread Ilkka Virta

On 10.7. 18:09, Chet Ramey wrote:

On 7/10/18 6:44 AM, Ilkka Virta wrote:

I think the problematic case here is when the number comes as input from
some program, which might or might not print a leading sign or leading
zeroes, but when we know that the number is, in any case, decimal.

E.g. 'date' prints leading zeroes, which is easy enough to handle:

hour=$(date +%H)

hour=${hour#0} # remove one leading zero, or
hour="10#$hour"    # make it base-10

The latter works even with more than one leading zero, but neither works
with a sign. So, handling numbers like '-00159' gets a bit annoying:


That is not an integer constant. Integer constants don't begin with `-'.
Bash uses the same definition for constants as the C standard, with the
addition of the `base#value' syntax.


At least from my point of view this isn't necessarily a bug, more like a 
feature request. The behaviour matches the description you just wrote, 
and also the documentation. That doesn't mean it's the only possible 
behaviour.


Changing the parsing here could be useful, and would improve 
compatibility with ksh and zsh. I don't think there's any alternative 
sensible meaning for  10#-0123  anyway, but I might be mistaken.



Since the `10#' notation is sufficient to deal with leading zeroes if you
want to force decimal, you only have to remove a leading unary plus or
minus.


Which I thought I just did, and Pierre provided a better way to do it 
(thanks).


Not having to splice the sign around would make this somewhat easier 
though, but YMMV.
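A wrapper along these lines (the name 'dec' is made up) would do the
sign splicing once and for all:

```shell
# Force base-10 interpretation of a possibly signed, zero-padded
# decimal string: strip the sign, prepend 10#, and re-attach the sign
# inside the arithmetic expansion itself.
dec() {
    local n=$1 sign=
    case $n in
        -*) sign=- ; n=${n#-} ;;
        +*) n=${n#+} ;;
    esac
    printf '%s\n' "$(( ${sign}10#$n ))"
}

dec -00159    # prints -159
dec +0034     # prints 34
```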


--
Ilkka Virta / itvi...@iki.fi



Re: Word boundary anchors \< and \> not parsed correctly on the right side of =~

2018-07-10 Thread Ilkka Virta

On 10.7. 15:27, Greg Wooledge wrote:

On Mon, Jul 09, 2018 at 10:46:13PM -0300, marcelpa...@gmail.com wrote:

Word boundary anchors \< and \> are not parsed correctly on the right side of a 
=~ regex match expression.


Bash uses ERE (Extended Regular Expressions) here.  There is no \< or \>
in an ERE.


Or does it use the system's regex library, whatever that supports?

On my Linux systems, this prints 'y' (with Bash 4.4.12 and 4.1.2):
re='\<bar\>' ; [[ "foo bar" =~ $re ]] && echo y


If '\<' matches just a regular less-than sign (but has a useless 
backslash), then surely that should not match?


That's the same example marcelpa...@gmail.com had, they didn't have
the <> signs in the string.

On my Mac, the above doesn't match. The same thing with a similar regex 
with \w .



http://pubs.opengroup.org/onlinepubs/9699919799/basedefs/V1_chap09.html#tag_09_04


This evaluates as false:

 [[ 'foo bar' =~ \<bar\> ]]


Well, of course it does, because \< is just a literal less-than sign
in a POSIX ERE.

wooledg:~$ re='\<bar\>'
wooledg:~$ [[ '<bar>' =~ $re ]] && echo yes
yes

You might as well remove the backslashes, because they serve no purpose
here.  If you thought they meant "word boundary" or something, you're
in the wrong language.
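For what it's worth, a word boundary can be spelled in portable ERE
without any backslash escapes; a sketch:

```shell
# Anchor on start/end of string or an explicit non-word character
# instead of the non-standard \< and \> anchors.
re='(^|[^[:alnum:]_])bar([^[:alnum:]_]|$)'
[[ 'foo bar baz' =~ $re ]] && echo match      # prints: match
[[ 'foobar' =~ $re ]] || echo 'no match'      # prints: no match
```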




--
Ilkka Virta / itvi...@iki.fi



Re: Number with sign is read as octal despite a leading 10#

2018-07-10 Thread Ilkka Virta

For what it's worth, ksh and zsh seem to interpret  10#-0159
as negative one-hundred and fifty-nine:

$ ksh -c 'for a do a="10#$a"; printf "%s " $((a + 1)); done; echo' \
  sh +159 +0159 -159 -0159
160 160 -158 -158
$ zsh -c 'for a do a="10#$a"; printf "%s " $((a + 1)); done; echo' \
  sh +159 +0159 -159 -0159
160 160 -158 -158

$ ksh --version
  version sh (AT&T Research) 93u+ 2012-08-01
$ zsh --version
zsh 5.3.1 (x86_64-debian-linux-gnu)


On 10.7. 13:44, Ilkka Virta wrote:
I think the problematic case here is when the number comes as input from 
some program, which might or might not print a leading sign or leading 
zeroes, but when we know that the number is, in any case, decimal.


E.g. 'date' prints leading zeroes, which is easy enough to handle:

hour=$(date +%H)

hour=${hour#0} # remove one leading zero, or
hour="10#$hour"    # make it base-10

The latter works even with more than one leading zero, but neither works 
with a sign. So, handling numbers like '-00159' gets a bit annoying:


$ num='-00159'
$ num="${num:0:1}10#${num:1}"; echo $(( num + 1 ))
-158

And that's without checking that the sign was there in the first place.


Something like that will probably not be too common, but an easier way 
to force any number to be interpreted in base-10 (regardless of leading 
zeroes) could be useful. If there is a way, I'd be happy to hear.



On 10.7. 04:37, Clint Hepner wrote:
The + is a unary operator, not part of the literal. Write 
$((+10#0034)) instead.


--
Clint
On Jul 9, 2018, 9:24 PM -0400, Isaac Marcos wrote:

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR
uname output: Linux IO 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1
(2018-05-07) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.4
Patch Level: 12
Release Status: release

Description:
A value inside an arithmetic expansion is processed as octal despite
using a 10# prefix.

Repeat-By:
$ echo $((10#+0034))
28

Fix:
Extract optional sign before parsing the number, re-attach after.

--
Cases are always threesome:
Best case, Worst case, and Just in case






--
Ilkka Virta / itvi...@iki.fi



Re: Number with sign is read as octal despite a leading 10#

2018-07-10 Thread Ilkka Virta
I think the problematic case here is when the number comes as input from 
some program, which might or might not print a leading sign or leading 
zeroes, but when we know that the number is, in any case, decimal.


E.g. 'date' prints leading zeroes, which is easy enough to handle:

hour=$(date +%H)

hour=${hour#0} # remove one leading zero, or
hour="10#$hour"# make it base-10

The latter works even with more than one leading zero, but neither works 
with a sign. So, handling numbers like '-00159' gets a bit annoying:


$ num='-00159'
$ num="${num:0:1}10#${num:1}"; echo $(( num + 1 ))
-158

And that's without checking that the sign was there in the first place.


Something like that will probably not be too common, but an easier way 
to force any number to be interpreted in base-10 (regardless of leading 
zeroes) could be useful. If there is a way, I'd be happy to hear.



On 10.7. 04:37, Clint Hepner wrote:

The + is a unary operator, not part of the literal. Write $((+10#0034)) instead.

--
Clint
On Jul 9, 2018, 9:24 PM -0400, Isaac Marcos wrote:

Configuration Information [Automatically generated, do not change]:
Machine: x86_64
OS: linux-gnu
Compiler: gcc
Compilation CFLAGS: -DPROGRAM='bash' -DCONF_HOSTTYPE='x86_64'
-DCONF_OSTYPE='linux-gnu' -DCONF_MACHTYPE='x86_64-pc-linux-gnu'
-DCONF_VENDOR
uname output: Linux IO 4.9.0-6-amd64 #1 SMP Debian 4.9.88-1+deb9u1
(2018-05-07) x86_64 GNU/Linux
Machine Type: x86_64-pc-linux-gnu

Bash Version: 4.4
Patch Level: 12
Release Status: release

Description:
A value inside an arithmetic expansion is processed as octal despite using
a 10# prefix.

Repeat-By:
$ echo $((10#+0034))
28

Fix:
Extract optional sign before parsing the number, re-attach after.

--
Cases are always threesome:
Best case, Worst case, and Just in case



--
Ilkka Virta / itvi...@iki.fi


