Re: [PATCH] unset/null variables should expand to 0 during arithmetic expansion

2017-02-01 Thread Michael Greenberg
Hi Martijn,

> On Jan 29, 2017, at 4:13 PM, Martijn Dekker <mart...@inlv.org> wrote:
> 
> Op 19-01-17 om 21:06 schreef Michael Greenberg:
>> unset x ; echo $((x+=2))
>> 
>> Running bash on this program echoes the number 2 to standard out and sets x 
>> to 2. Running dash (git HEAD/release 0.5.9.1) yields an error:
>> 
>> src/dash: 1: Illegal number: 
> 
> Yes, looks like a dash-specific bug. The related shells Busybox ash and
> FreeBSD /bin/sh act like POSIX says they should.

Great! Please pardon my naivete, but is there anything else I can or should do 
to see that this bug patch gets accepted and applied?

>> The standard is a little bit unclear about what to do for variables that are 
>> set to null (or non-numeric values). I’d interpret the standard as 
>> defaulting for unset variables, having null and other non-numeric strings 
>> cause some kind of error. bash, as usual, goes well beyond what the standard 
>> indicates: any non-numerical value is defaulted to 0. So:
>> 
>> x="" ; echo $((x+=2))
>> 
>> x=“yo” ; echo $((x+=2))
>> 
>> both yield 2! If you ask me, that’s a bridge too far—and asking for bugs.
> 
> That's recursive arithmetic expression parsing in bash, ksh and zsh; if
> the value of a variable is an arithmetic expression (including simply
> another variable name, such as "yo") this is evaluated, which in this
> case means the value of "x" is taken as the value of "yo" which is taken
> as zero. That behaviour is not specified by POSIX, of course.

Right. I only mentioned it because I—as I think we both agree—this behavior is 
underspecified and probably good as-is. :)

Cheers,
Michael

[PATCH] unset/null variables should expand to 0 during arithmetic expansion

2017-01-19 Thread Michael Greenberg
Hi,

It seems like dash has a subtly incorrect arithmetic expansion for variables 
that are unset. For example, consider the following program:

unset x ; echo $((x+=2))

Running bash on this program echoes the number 2 to standard out and sets x to 
2. Running dash (git HEAD/release 0.5.9.1) yields an error:

src/dash: 1: Illegal number: 

Now, bash and dash disagreeing isn’t such a big deal. But it seems like the 
standard indicates bash’s defaulting is correct for unset variables. According 
to 
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap02.html#tag_18_06_04>,
 “the arithmetic expression shall be processed according to the rules given in 
Arithmetic Precision and Operations”, which says “all variables shall be 
initialized to zero if they are not otherwise assigned by the input to the 
application” 
<http://pubs.opengroup.org/onlinepubs/9699919799/utilities/V3_chap01.html#tag_17_01_02_01>.

The standard is a little bit unclear about what to do for variables that are 
set to null (or non-numeric values). I’d interpret the standard as defaulting 
for unset variables, having null and other non-numeric strings cause some kind 
of error. bash, as usual, goes well beyond what the standard indicates: any 
non-numerical value is defaulted to 0. So:

x="" ; echo $((x+=2))

x=“yo” ; echo $((x+=2))

both yield 2! If you ask me, that’s a bridge too far—and asking for bugs.

I should add that my student (Austin Blatt, CC-ed) noticed this issue; I wrote 
the attached, single-line fix.

Cheers,
Michael

Signed-off-by: Michael Greenberg <michael.greenb...@pomona.edu>
—
diff --git a/src/var.c b/src/var.c
index cc6f7f2..e34f9cf 100644
--- a/src/var.c
+++ b/src/var.c
@@ -353,7 +353,7 @@ lookupvar(const char *name)
 
 intmax_t lookupvarint(const char *name)
 {
-   return atomax(lookupvar(name) ?: nullstr, 0);
+   return atomax(lookupvar(name) ?: "0", 0);
 }
 
 

 




Re: [PATCH] Revert 78a00a7 and 3cd5386, they are not ready yet.

2018-11-15 Thread Michael Greenberg
Seconded! A test suite for dash would have saved me a great deal of 
trouble. I have a number of tests I've written that I'd be happy to 
contribute.

Is dash ever tested on the POSIX test suite? Certification is expensive, 
but their test suite seems to be quite thorough... and it may even be 
possible to use their tests even if dash isn't going to be officially 
POSIX compliant.

Cheers,
Michael

On 2018-11-15 9:03 AM, Devin Hussey wrote:
>  From 65f9c258658b50aeb67947f5b54e926538560488 Mon Sep 17 00:00:00 2001
> From: Devin Hussey 
> Date: Thu, 15 Nov 2018 08:29:33 -0500
> Subject: [PATCH] Revert 78a00a7 and 3cd5386, they are not ready yet.
>
> They are not ready, as simply running dash on its own configure
> script will reveal (note: config.cache does not exist):
>
>  checking for a BSD-compatible install... (cached)
>  checking whether build environment is sane... yes
>  configure: 1: eval: --is-lightweight: not found
>  configure: WARNING: 'missing' script is too old or missing
>  checking for a thread-safe mkdir -p... (cached)  -p
>  checking for gawk... (cached) no
>  checking for mawk... (cached) no
>  checking for nawk... (cached) no
>  checking for awk... (cached) no
>  checking whether make sets $(MAKE)... (cached) configure: 1: test:
> =: unexpected operator
>  no
>  checking whether make supports nested variables... (cached)
>  configure: 2878: test: =: unexpected operator
>  checking for gcc... (cached) no
>  checking for cc... (cached) no
>  checking for cl.exe... (cached) no
>  configure: error: in `/Users/Devin/dash2':
>  configure: error: no acceptable C compiler found in $PATH
>  See `config.log' for more details
>
> 3cd5386 completely broke expansion.
>
> I got the configure script to run by changing
>
>  return p - 1;
>
> to
>
>  return p;
>
> in argstr, but variable expansion is still broken.
>
> The most broken thing is ${FOO+BAR}:
>
>  $ bash -c 'unset FOO; echo ${FOO+BAR}'
>
>  $ ./dash -c 'unset FOO; echo ${FOO+BAR}'
>  BAR
>  $
>
> We should really set up a test suite, as that isn't the first
> time some obvious bugs have made their way into the code base.
> I wonder if FreeBSD will let us use their tests, as they have
> a very (too?) expansive set of tests.
>
> At the very least, we should make sure dash can run its own
> configure script, as that is the most common shell script
> to run.
>
> Signed-off-by: Devin Hussey 
> ---
>   src/expand.c | 321 ---
>   src/expand.h |   2 +-
>   2 files changed, 176 insertions(+), 147 deletions(-)
>
> diff --git a/src/expand.c b/src/expand.c
> index 14daa63..7a51766 100644
> --- a/src/expand.c
> +++ b/src/expand.c
> @@ -110,13 +110,13 @@ static struct ifsregion *ifslastp;
>   /* holds expanded arg list */
>   static struct arglist exparg;
>
> -static char *argstr(char *p, int flag);
> -static char *exptilde(char *startp, int flag);
> -static char *expari(char *start, int flag);
> +STATIC void argstr(char *, int);
> +STATIC char *exptilde(char *, char *, int);
>   STATIC void expbackq(union node *, int);
> +STATIC const char *subevalvar(char *, char *, int, int, int, int, int);
>   STATIC char *evalvar(char *, int);
>   static size_t strtodest(const char *p, int flags);
> -static size_t memtodest(const char *p, size_t len, int flags);
> +static void memtodest(const char *p, size_t len, int flags);
>   STATIC ssize_t varvalue(char *, int, int, int);
>   STATIC void expandmeta(struct strlist *, int);
>   #ifdef HAVE_GLOB
> @@ -133,7 +133,7 @@ STATIC int pmatch(const char *, const char *);
>   #else
>   #define pmatch(a, b) !fnmatch((a), (b), 0)
>   #endif
> -static size_t cvtnum(intmax_t num, int flags);
> +STATIC int cvtnum(intmax_t);
>   STATIC size_t esclen(const char *, const char *);
>   STATIC char *scanleft(char *, char *, char *, char *, int, int);
>   STATIC char *scanright(char *, char *, char *, char *, int, int);
> @@ -192,11 +192,13 @@ expandarg(union node *arg, struct arglist
> *arglist, int flag)
>   argbackq = arg->narg.backquote;
>   STARTSTACKSTR(expdest);
>   argstr(arg->narg.text, flag);
> +p = _STPUTC('\0', expdest);
> +expdest = p - 1;
>   if (arglist == NULL) {
>   /* here document expanded */
>   goto out;
>   }
> -p = grabstackstr(expdest);
> +p = grabstackstr(p);
>   exparg.lastp = 
>   /*
>* TODO - EXP_REDIR
> @@ -230,7 +232,8 @@ out:
>* $@ like $* since no splitting will be performed.
>*/
>
> -static char *argstr(char *p, int flag)
> +STATIC void
> +argstr(char *p, int flag)
>   {
>   static const char spclchars[] = {
>   '=',
> @@ -240,7 +243,6 @@ static char *argstr(char *p, int flag)
>   CTLESC,
>   CTLVAR,
>   CTLBACKQ,
> -CTLARI,
>   CTLENDARI,
>   0
>   };
> @@ -251,41 +253,35 @@ static char *argstr(char *p, int flag)

[PATCH] Bug in times builtin

2019-06-10 Thread Michael Greenberg
Hi all,

The `times` builtin in dash fails to compute the seconds part
properly. To see the bug, I ran the following script:

```sh
delay() {
case "$#" in
( 0 ) secs=1 ;;
( 1 ) secs=$(($1+0)) ;;
( * ) exit 2 ;;
esac

start=$(date "+%s")
while now=$(date "+%s") ;
  [ $((now - start)) -lt "$secs" ]
do
:
done
}

times
delay $((5 * 60))
times
```

The script implements a 5min sleep. An unpatched dash produces the
following output:

```
0m0.00s 0m0.00s
0m0.00s 0m0.00s
0m8.33s 0m26.69s
1m97.67s 1m107.15s
```

With the patch, the seconds are computed correctly (NB that this is a
different run so times vary slightly):

```
0m0.00s 0m0.00s
0m0.00s 0m0.00s
0m8.26s 0m27.05s
1m38.13s 1m45.96s
```

If folks are opposed to including math.h for some reason, I'm sure the
computation could be done another way.

Cheers,
Michael

Signed-off-by: Michael Greenberg 
diff --git a/src/bltin/times.c b/src/bltin/times.c
index 8eabc1f..7e66f90 100644
--- a/src/bltin/times.c
+++ b/src/bltin/times.c
@@ -3,6 +3,7 @@
  * This file contains code for the times builtin.
  */
 
+#include 
 #include 
 #include 
 #ifdef USE_GLIBC_STDIO
@@ -19,12 +20,12 @@ int timescmd() {
times();
printf("%dm%fs %dm%fs\n%dm%fs %dm%fs\n",
   (int) (buf.tms_utime / clk_tck / 60),
-  ((double) buf.tms_utime) / clk_tck,
+  fmod(((double) buf.tms_utime) / clk_tck, 60),
   (int) (buf.tms_stime / clk_tck / 60),
-  ((double) buf.tms_stime) / clk_tck,
+  fmod(((double) buf.tms_stime) / clk_tck, 60),
   (int) (buf.tms_cutime / clk_tck / 60),
-  ((double) buf.tms_cutime) / clk_tck,
+  fmod(((double) buf.tms_cutime) / clk_tck, 60),
   (int) (buf.tms_cstime / clk_tck / 60),
-  ((double) buf.tms_cstime) / clk_tck);
+  fmod(((double) buf.tms_cstime) / clk_tck, 60));
return 0;
 }


Superseded patch?

2019-06-10 Thread Michael Greenberg
Hi all,

I just noticed that an old patch of mine is marked as 'superseded' in
patchwork . Was this a
conscious decision not to fix the POSIX noncompliance here, an accident,
or something else? Should I resubmit the patch?

Cheers,
Michael


[PATCH] alias: Fix handling of empty aliases

2019-05-09 Thread Michael Greenberg
Dash was incorrectly handling empty aliases. When attempting to use an
empty alias with nothing else, I'm (incorrectly) prompted for more
input:

```
$ alias empty=''
$ empty
>
```

Other shells (e.g., bash, yash) correctly handle the lone, empty alias as an
empty command:

```
$ alias empty=''
$ empty
$
```

This patch fixes the parser to handle the case where an alias is empty,
i.e., produces no token.

Signed-off-by: Michael Greenberg 
diff --git a/src/parser.c b/src/parser.c
index 1f9e8ec..a1d6116 100644
--- a/src/parser.c
+++ b/src/parser.c
@@ -470,6 +470,7 @@ next_case:
break;
case TWORD:
case TREDIR:
+   case TNL: /* necessary for empty aliases */
tokpushback++;
return simplecmd();
}
@@ -717,6 +718,7 @@ top:
}
}
 
+ignorenl: /* empty alias? */
if (t != TWORD || quoteflag) {
goto out;
}
@@ -739,8 +741,11 @@ top:
if ((ap = lookupalias(wordtext, 1)) != NULL) {
if (*ap->val) {
pushstring(ap->val, ap);
+   goto top;
+   } else {
+   t = xxreadtoken();
+   goto ignorenl;
}
-   goto top;
}
}
 out:



[v2 PATCH] mktokens relative TMPDIR

2020-04-29 Thread Michael Greenberg
The mktokens script fails when /tmp isn't writable (e.g., when building
in a sandbox with a different TMPDIR). Replace absolute references to
/tmp to relative references to TMPDIR. If TMPDIR is unset or null,
default to /tmp.

The mkbuiltins script was already hardened to work relative to TMPDIR,
also defaulting to /tmp.

v2 ensures that TMPDIR is quoted.

Signed-off-by: Michael Greenberg 

diff --git a/src/mktokens b/src/mktokens
index cd52241..3ab7bc5 100644
--- a/src/mktokens
+++ b/src/mktokens
@@ -37,7 +37,9 @@
 # token marks the end of a list.  The third column is the name to print in
 # error messages.

-cat > /tmp/ka$$ <<\!
+: ${TMPDIR:=/tmp}
+
+cat > "${TMPDIR}"/ka$$ <<\!
 TEOF   1   end of file
 TNL0   newline
 TSEMI  0   ";"
@@ -68,28 +70,28 @@ TWHILE  0   "while"
 TBEGIN 0   "{"
 TEND   1   "}"
 !
-nl=`wc -l /tmp/ka$$`
+nl=`wc -l "${TMPDIR}"/ka$$`
 exec > token.h
-awk '{print "#define " $1 " " NR-1}' /tmp/ka$$
+awk '{print "#define " $1 " " NR-1}' "${TMPDIR}"/ka$$

 exec > token_vars.h

 echo '
 /* Array indicating which tokens mark the end of a list */
 static const char tokendlist[] = {'
-awk '{print "\t" $2 ","}' /tmp/ka$$
+awk '{print "\t" $2 ","}' "${TMPDIR}"/ka$$
 echo '};

 static const char *const tokname[] = {'
 sed -e 's/"/\\"/g' \
 -e 's/[^]*[ ][  ]*[^]*[ ][  ]*\(.*\)/  "\1",/' 
\
-/tmp/ka$$
+"${TMPDIR}"/ka$$
 echo '};
 '
-sed 's/"//g' /tmp/ka$$ | awk '
+sed 's/"//g' "${TMPDIR}"/ka$$ | awk '
 /TNOT/{print "#define KWDOFFSET " NR-1; print ""; 
   print "static const char *const parsekwd[] = {"}
 /TNOT/,/neverfound/{if (last) print "  \"" last "\","; last = $3}
 END{print "\"" last "\"\n};"}'

-rm /tmp/ka$$
+rm "${TMPDIR}"/ka$$


Re: [BUG] one-shot variables aren't propagated past functions

2020-04-30 Thread Michael Greenberg
On 2020-04-30 06:00:01, Denton Liu  wrote:

> Hi all,
>
> I believe that I've encountered a bug in dash 0.5.10.2. With the
> following input
>
> f () {
> echo $var
> }
>
> var=test f
> echo $var
>
> I would expect the output to be
>
> test
> test
>
> but the output is actually
>
> test
> 
>
> It seems like this violates the POSIX spec.

Junio's reading of the spec isn't quite right. From Section 2.9.1
"Simple Commands":

  If the command name is a function that is not a standard utility
  implemented as a function, variable assignments shall affect the
  current execution environment during the execution of the function. It
  is unspecified:

  - Whether or not the variable assignments persist after the completion
of the function

  - Whether or not the variables gain the export attribute during the
execution of the function

  - Whether or not export attributes gained as a result of the variable
assignments persist after the completion of the function (if
variable assignments persist after the completion of the function)

I've seen only two ways that shells treat simple commands of the form
`VAR=VAL FUNC ARGS ...`. Shells with support for the (unspecified)
`local` command tend to have the behavior dash does, where the
assignment of `VAR` is local in scope. Other shells treat such
assignments globally.

You can find a summary of this issue in my recent paper from POPL 2020,
"Executable formal semantics for the POSIX shell", Section
8.3. (https://mgree.github.io/papers/popl2020_smoosh.pdf)

Cheers,
Michael


Quoting Junio C Hamano[1]:
>> Johannes Sixt  writes:
>>
>> > Tarmigan Casebolt schrieb:
>> >> REQUEST_METHOD="GET" some_shell_function
>> >
>> >> I can't tell from my reading of the POSIX spec whether my usage was
>> >> wrong or if dash is wrong,
>> >
>> > According to POSIX, variables set as shown above for shell functions are
>> > not exported and retain their value after the function returns.
>>
>> I actually looked for this yesterday, but didn't find a relevant
>> definition.  But "2.9.5 Function Definition Command" [*1*] seems to
>> address the issue: "When a function is executed, it shall have the
>> syntax-error and variable-assignment properties described for special
>> built-in utilities...".
>>
>> And "2.14 Special Built-in Utilities" section [*2*] says "2. Variable
>> assignments specified with special built-in utilities remain in effect
>> after the built-in completes...".  Taking both together, it seems that
>> the assignment should be in effect after the function returns.
>>
> [...]
>>
>> [References]
>>
>> *1* 
>> https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.opengroup.org%2Fonlinepubs%2F9699919799%2Futilities%2FV3_chap02.html%23tag_18_09_05data=02%7C01%7CMichael.Greenberg%40pomona.edu%7Cb8c8aa7c728f4110ab8208d7eced43f4%7C817f590439044ee8b3a5a65d4746ff70%7C0%7C0%7C637238376114210837sdata=ZDuZDf%2B2fbEbehYK9DpFmHHhtFwEjUh51pQwdexANdk%3Dreserved=0
>> *2* 
>> https://nam01.safelinks.protection.outlook.com/?url=http%3A%2F%2Fwww.opengroup.org%2Fonlinepubs%2F9699919799%2Futilities%2FV3_chap02.html%23tag_18_14data=02%7C01%7CMichael.Greenberg%40pomona.edu%7Cb8c8aa7c728f4110ab8208d7eced43f4%7C817f590439044ee8b3a5a65d4746ff70%7C0%7C0%7C637238376114210837sdata=2NGIBN2kWLb%2FwZRy5lh021vUd%2FVLXEh2IzNq0lpv0R4%3Dreserved=0
>
> It also seems like on bash 5.0.16, running with --posix produces the
> expected output, although without --posix it produces the blank line.
>
> Originally discussed here[2].




> Thanks,
>
> Denton
>
> [1]: 
> https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fgit%2F7vljfzz0yd.fsf%40alter.siamese.dyndns.org%2Fdata=02%7C01%7CMichael.Greenberg%40pomona.edu%7Cb8c8aa7c728f4110ab8208d7eced43f4%7C817f590439044ee8b3a5a65d4746ff70%7C0%7C0%7C637238376114210837sdata=bQcTgwa%2Bb%2FobsCT5L2laRqP%2BmhJQNJgCHllsVHNcwhw%3Dreserved=0
> [2]: 
> https://nam01.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore.kernel.org%2Fgit%2F20200430092220.GA25331%40generichostname%2Fdata=02%7C01%7CMichael.Greenberg%40pomona.edu%7Cb8c8aa7c728f4110ab8208d7eced43f4%7C817f590439044ee8b3a5a65d4746ff70%7C0%7C0%7C637238376114220834sdata=cEpyKOzR6UB%2Bvdr%2FOK7zsex1aRj7Fqyng5ZpH9ZDvSc%3Dreserved=0
> 
>
> [EXTERNAL EMAIL] Exercise caution before clicking on links or opening 
> attachments.


[v3 PATCH] mktokens relative TMPDIR

2020-04-29 Thread Michael Greenberg
The mktokens script fails when /tmp isn't writable (e.g., when building
in a sandbox with a different TMPDIR). Replace absolute references to
/tmp to relative references to TMPDIR. If TMPDIR is unset or null,
default to /tmp.

The mkbuiltins script was already hardened to work relative to TMPDIR,
also defaulting to /tmp.

v2 ensures that TMPDIR is quoted.
v3 adds an extra quotation that prevents extra pathname expansions.

Signed-off-by: Michael Greenberg 

diff --git a/src/mktokens b/src/mktokens
index cd52241..3ab7bc5 100644
--- a/src/mktokens
+++ b/src/mktokens
@@ -37,7 +37,9 @@
 # token marks the end of a list.  The third column is the name to print in
 # error messages.

-cat > /tmp/ka$$ <<\!
+: "${TMPDIR:=/tmp}"
+
+cat > "${TMPDIR}"/ka$$ <<\!
 TEOF   1   end of file
 TNL0   newline
 TSEMI  0   ";"
@@ -68,28 +70,28 @@ TWHILE  0   "while"
 TBEGIN 0   "{"
 TEND   1   "}"
 !
-nl=`wc -l /tmp/ka$$`
+nl=`wc -l "${TMPDIR}"/ka$$`
 exec > token.h
-awk '{print "#define " $1 " " NR-1}' /tmp/ka$$
+awk '{print "#define " $1 " " NR-1}' "${TMPDIR}"/ka$$

 exec > token_vars.h

 echo '
 /* Array indicating which tokens mark the end of a list */
 static const char tokendlist[] = {'
-awk '{print "\t" $2 ","}' /tmp/ka$$
+awk '{print "\t" $2 ","}' "${TMPDIR}"/ka$$
 echo '};

 static const char *const tokname[] = {'
 sed -e 's/"/\\"/g' \
 -e 's/[^]*[ ][  ]*[^]*[ ][  ]*\(.*\)/  "\1",/' 
\
-/tmp/ka$$
+"${TMPDIR}"/ka$$
 echo '};
 '
-sed 's/"//g' /tmp/ka$$ | awk '
+sed 's/"//g' "${TMPDIR}"/ka$$ | awk '
 /TNOT/{print "#define KWDOFFSET " NR-1; print ""; 
   print "static const char *const parsekwd[] = {"}
 /TNOT/,/neverfound/{if (last) print "  \"" last "\","; last = $3}
 END{print "\"" last "\"\n};"}'

-rm /tmp/ka$$
+rm "${TMPDIR}"/ka$$