Hi!

----
A few days ago Mike Kupfer found a crash while trying to build the
ksh93-integration prototype on an AMD64 machine. The "make install"
target in usr/src/cmd/ksh/ runs the ksh93/AST test suite, which crashed
in one of the test runs.

The crash sequence looks like this:
-- snip --
$ cd usr/src/cmd/ksh/amd64
$ make install
[snip]
# which ksh='/home/test001/ksh93/on_build1/test1_x86/usr/src/cmd/ksh/amd64/ksh', ksh93='/home/test001/ksh93/on_build1/test1_x86/usr/src/cmd/ksh/amd64/ksh93'
## Running ksh test: LANG='C' script='alias.sh'
## Running ksh test: LANG='C' script='append.sh'
## Running ksh test: LANG='C' script='arith.sh'
ksh[18]: 24269 Segmentation Fault(coredump)
*** Error code 139
-- snip --

The stack trace looks like this:
-- snip --
$ dbx - core
Corefile specified executable: "/home/test001/ksh93/on_build1/test1_x86/proto/root_i386/usr/bin/amd64/ksh93"
For information about new features see `help changes'
To remove this message, put `dbxenv suppress_startup_message 7.4' in your .dbxrc
Reading ksh93
core file header read successfully
Reading ld.so.1
Reading libshell.so.1
Reading libc.so.1
Reading libcmd.so.1
Reading libdll.so.1
Reading libast.so.1
Reading libsecdb.so.1
Reading libm.so.2
Reading libsocket.so.1
Reading libnsl.so.1
program terminated by signal SEGV (no mapping at the fault address)
0xfffffd7fff357eae: expr+0x004e:        cmpl     0x000000000003ef1f(%rbx),%r8d
(dbx) where
=>[1] expr(0xfffffd7fffdfe948, 0x1f, 0x28, 0xfffffd7fff37048e, 0xfffffd7fff357ed2, 0x7), at 0xfffffd7fff357eae
  [2] arith_compile(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff359136
  [3] sh_arithcomp(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3271cf
  [4] getanode(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34d236
  [5] item(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34f1d5
  [6] term(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34da7f
  [7] list(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34d9b4
  [8] sh_cmd(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34d888
  [9] item(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34f12d
  [10] term(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34da7f
  [11] list(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34d9b4
  [12] sh_cmd(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34d888
  [13] sh_parse(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff34d530
  [14] exfile(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff3235ff
  [15] sh_main(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0xfffffd7fff322f22
  [16] main(0x0, 0x0, 0x0, 0x0, 0x0, 0x0), at 0x4009d5
-- snip --

My compiler version is:
-- snip --
$ cc -V
cc: Sun C 5.7 Patch 117837-04 2005/05/11
usage: cc [ options] files.  Use 'cc -flags' for details
$ CC -V
CC: Sun C++ 5.7 Patch 117831-02 2005/03/30
$ uname -a
SunOS hal-9000 5.11 snv_43 i86pc i386 i86pc SunOS
-- snip --

I suspect this may be a compiler bug in Sun Studio 10 (CC:'ing
"Christopher D. Quenelle" <Chris.Quenelle at Sun.COM>) since the issue
is AMD64-specific: neither the 32bit binaries (i86, sparc) nor the 64bit
sparcv9 binary show this problem, and turning the optimisation off cures
it, as described in the commit for the workaround (see
http://polaris.blastwave.org/changeset/391).
Since then a refined version of the workaround has been applied
(http://polaris.blastwave.org/changeset/400) which narrows the problem
down to the corresponding source file.

* Steps to reproduce:
1. Pull sources and extract closed bin stuff:
$ svn checkout -r 400 svn://svn.genunix.org/on/branches/ksh93/gisburn/prototype002/m1_ast_ast_imported/usr
$ bzcat ../download/on-closed-bins-b37.i386.tar.bz2 | tar -xf -

2. Run "bldenv":
$ cd .. ; env - SHELL=$SHELL TERM=$TERM HOME=$HOME LOGNAME=$LOGNAME DISPLAY=$DISPLAY LANG=C LC_ALL=C PAGER=less MANPATH=$MANPATH /opt/onbld/bin/bldenv opensolaris.sh

3. Build it (the quick way):
$ cd test_x86/usr/src
$ time nice make setup 2>&1 | tee -a buildlog_setup.log
$ time nice dmake install >buildlog.log 2>&1

4. Back out the workaround added with
http://polaris.blastwave.org/changeset/400 and
http://polaris.blastwave.org/changeset/391

5. Recompile libshell:
$ cd lib/libshell ; make install

6a. Run tests (normal way):
$ cd ../../cmd/ksh/amd64 ; make install
OR
6b. Run tests manually (NOT RECOMMENDED):
% (LD_LIBRARY_PATH=$ROOT/lib/amd64 $ROOT/usr/bin/amd64/ksh93 $SRC/lib/libshell/common/tests/arith.sh )

I uploaded a sample core dump from a snv_43 build machine to
http://www.opensolaris.org/os/project/ksh93-integration/downloads/ksh93_integration_20060821_amd64_crash_in_arith_sh.core.bz2
(MD5(ksh93_integration_20060821_amd64_crash_in_arith_sh.core.bz2)=
10357a20d58ff17132de914e10896515) for further analysis by someone who
may know AMD64 assembler better than I do...

... help/suggestions/comments/rants/etc. welcome... :-)

----

Bye,
Roland

--
  __ .  . __
 (o.\ \/ /.o) roland.mainz at nrubsig.org
  \__\/\/__/  MPEG specialist, C&&JAVA&&Sun&&Unix programmer
  /O /==\ O\  TEL +49 641 7950090
 (;O/ \/ \O;)