Solaris 11.4, SPARC-M8, 1 coreutils fail  (cfarm216)

We've an issue due to the recent posix_spawn() changes:

  /tests/install/basic-1.sh: line 59:
  ginstall $strip -c -m 555 $dd $dir
  154 Bus Error               (core dumped)

It's 100% reproducible with:
 gmake TESTS=tests/install/basic-1.sh SUBDIRS=. check

Putting truss in the test shows the failure after vfork:

vforkx(0)                                       = 26169
lwp_sigmask(SIG_SETMASK, 0x00000000, 0x00000000, 0x00000000, 0x00000000) = 0xFFBFFEFF [0xFFFFFFFF]
    Incurred fault #5, FLTACCESS  %pc = 0x100022468
      siginfo: SIGBUS BUS_ADRALN addr=0xFFFF2F682F68772C
    Received signal #10, SIGBUS [default]
      siginfo: SIGBUS BUS_ADRALN addr=0xFFFF2F682F68772C

Notes:
 * posix_spawnp() is being replaced on this system
   (because gl_cv_func_posix_spawnp_secure_exec=no)
 * If I set flags to NULL so that less is run after the vfork,
   the error still occurs.
   I suspect the "child" running getenv("PATH") etc., rather than just
   execvp(), is triggering this
 * If I run the ginstall command from the shell it's OK
   I've not debugged why there is different behavior here.
 * If I forcibly ifdef out the VFORK call in gnulib's spawni it's OK
 * If I force the use of the system posix_spawnp it's OK. Tested with:
   ./configure gl_cv_func_posix_spawnp_secure_exec=yes --quiet && gmake -j8
   gmake TESTS=tests/install/basic-1.sh SUBDIRS=. check

BTW Collin, when looking at this I think the use of POSIX_SPAWN_USEVFORK
is too aggressive, and might trigger other bugs on older glibc at least.
It's ok to be slower on these older systems
(glibc 2.24 was released in 2016 after all).
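
For illustration, here is a minimal sketch of a caller that spawns with
default attributes, i.e. without asking for POSIX_SPAWN_USEVFORK, and leaves
the process-creation strategy to libc. As far as I know the flag is a no-op
from glibc 2.24 on anyway. The helper name run_child is hypothetical; this is
not the coreutils or gnulib code, just the general shape:

```c
#include <errno.h>
#include <spawn.h>
#include <sys/types.h>
#include <sys/wait.h>

extern char **environ;

/* Hypothetical helper (not coreutils or gnulib code): run PROG with ARGV
   via posix_spawnp using default attributes, i.e. without requesting
   POSIX_SPAWN_USEVFORK.  Return the child's exit status, 128 + signal
   number if it was killed by a signal, or -1 on spawn/wait failure.  */
int
run_child (const char *prog, char *const argv[])
{
  pid_t pid;
  int status;

  /* NULL file actions and NULL attributes: libc chooses how to create
     the child.  posix_spawnp returns an error number, it does not set
     errno itself.  */
  int err = posix_spawnp (&pid, prog, NULL, NULL, argv, environ);
  if (err != 0)
    {
      errno = err;
      return -1;
    }

  /* Retry the wait if it is interrupted by a signal.  */
  while (waitpid (pid, &status, 0) < 0)
    if (errno != EINTR)
      return -1;

  if (WIFEXITED (status))
    return WEXITSTATUS (status);
  if (WIFSIGNALED (status))
    return 128 + WTERMSIG (status);
  return -1;
}
```

With something like this a SIGBUS in the child would show up as a return
value of 128 + SIGBUS in the caller, rather than going unnoticed.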

To avoid this issue (if we don't directly fix the bug) we could force
gl_cv_func_posix_spawnp_secure_exec=yes. Since coreutils was previously using
execvp() (which has the /bin/sh fallback), we'd be no worse off in this regard
at least.

Alternatively coreutils could force gnulib not to use vfork at all,
so we'd have the slower (but more secure) behavior as a fallback.
One could do this like:

diff --git a/configure.ac b/configure.ac
index 5e99ef386..5f59b91fc 100644
--- a/configure.ac
+++ b/configure.ac
@@ -60,6 +60,7 @@ AC_PROG_EGREP
 AC_PROG_LN_S
 gl_EARLY
 gl_SET_CRYPTO_CHECK_DEFAULT([auto-gpl-compat])
+AC_CACHE_VAL([ac_cv_func_vfork], [ac_cv_func_vfork=no])
 gl_INIT
 coreutils_MACROS

If we forced both ac_cv_func_vfork=no and
gl_cv_func_posix_spawnp_secure_exec=yes
we'd have the same security as on <= 9.8, but faster operation on
Solaris 11 etc.

cheers,
Padraig
