Jan,

I've looked at the nodes, and I think the files (in particular the oracle account) are damaged beyond repair. To reproduce this problem, it is probably better to reinstall everything from scratch using the same OS and bits. I'll sync up with the lab and project members and get back to you.
Many thanks for all this help!

Fernando

Jan Setje-Eilers wrote on 02/15/06 17:32:

> I spent some time with the machine that was printing:
>
>Requesting maintenance mode
>(See /lib/svc/share/README for additional information.)
>Requesting maintenance mode
>(See /lib/svc/share/README for additional information.)
>Requesting maintenance mode
>(See /lib/svc/share/README for additional information.)
>
> instead of booting.
>
> First of all the root filesystem had some serious problems. I ran
>fsck and cleaned up, but some log files, snmpd.conf and a backup copy
>of the smf repository were lost. I re-made wtmpx. I also found a
>bunch of missing shadow nodes in /devices. I re-made them, re-built the
>devlinks and the system is now up and running again.
>
> For the extent of the ufs damage, please see the attached note. I did
>re-run fsck until things were totally clean. Given the ufs damage, I'm
>going to guess that oracle does not shut the machine down cleanly, but
>we should confirm that by re-running your tests with a breakpoint set
>in uadmin() or kadmin(). Every even remotely sane way down comes
>through there, so this should let us know how the system is getting
>there.
>
> However if it is tearing down the machine as aggressively as the
>filesystem damage makes it appear, then we need to fix what oracle is
>doing.
>
>-jan
>
>------------------------------------------------------------------------
>
>updating /dev/dsk/c2t0d0s0...
>** /dev/rdsk/c2t0d0s0
>** Last Mounted on /
>** Phase 1 - Check Blocks and Sizes
>PARTIALLY TRUNCATED INODE I=190134
>SALVAGE? y
>
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>EXCESSIVE BAD BLKS I=190134
>CONTINUE? y
>
>INCORRECT BLOCK COUNT I=190134 (640 should be 352)
>CORRECT? y
>
>1620344 DUP I=190321
>1620345 DUP I=190321
>1620346 DUP I=190321
>1620347 DUP I=190321
>1620348 DUP I=190321
>1620349 DUP I=190321
>1620350 DUP I=190321
>1620351 DUP I=190321
>1620352 DUP I=190321
>1620353 DUP I=190321
>EXCESSIVE DUP BLKS I=190321
>CONTINUE? y
>
>INCORRECT BLOCK COUNT I=190321 (16464 should be 18)
>CORRECT? y
>
>1620440 DUP I=190324
>1620441 DUP I=190324
>1867352 DUP I=215730
>1867360 DUP I=215730
>1867361 DUP I=215730
>1867362 DUP I=215730
>1867363 DUP I=215730
>1867364 DUP I=215730
>1867365 DUP I=215730
>1867366 DUP I=215730
>1867367 DUP I=215730
>1867368 DUP I=215730
>EXCESSIVE DUP BLKS I=215730
>CONTINUE? y
>
>INCORRECT BLOCK COUNT I=215730 (140 should be 80)
>CORRECT? y
>
>** Phase 1b - Rescan For More DUPS
>1620344 DUP I=7729
>1620345 DUP I=7729
>1620346 DUP I=7729
>1620347 DUP I=7729
>1620348 DUP I=7729
>1620349 DUP I=7729
>1620350 DUP I=7729
>1620351 DUP I=7729
>1620352 DUP I=190138
>1620440 DUP I=190138
>1620441 DUP I=190138
>1867360 DUP I=200646
>1867361 DUP I=200646
>1867362 DUP I=200646
>1867363 DUP I=200646
>1867364 DUP I=200646
>1867365 DUP I=200646
>1867366 DUP I=200646
>1867367 DUP I=200646
>1867352 DUP I=215729
>** Phase 2 - Check Pathnames
>DUP/BAD I=190138 OWNER=root MODE=100600
>SIZE=2388992 MTIME=Feb 15 11:01 2006
>FILE=/etc/svc/repository-boot-20060215_110135
>
>REMOVE? y
>
>DUP/BAD I=7729 OWNER=adm MODE=100644
>SIZE=66960 MTIME=Feb 15 12:23 2006
>FILE=/var/adm/wtmpx
>
>REMOVE? y
>
>UNALLOCATED I=190140 OWNER=root MODE=0
>SIZE=0 MTIME=Feb 15 12:23 2006
>NAME=/var/log/Xorg.0.log
>
>REMOVE? y
>
>DUP/BAD I=190134 OWNER=root MODE=100644
>SIZE=315392 MTIME=Feb 15 12:23 2006
>FILE=/var/sma_snmp/snmpd.conf
>
>REMOVE? y
>
>DUP/BAD I=200646 OWNER=5000 MODE=100640
>SIZE=16200 MTIME=Feb 15 12:16 2006
>FILE=/export/oracle/oracle/crs/evm/log/clear19_evmlog.20060215
>
>REMOVE? y
>
>DUP/BAD I=215729 OWNER=5000 MODE=100644
>SIZE=5 MTIME=Feb 15 12:11 2006
>FILE=/export/oracle/oracle/crs/log/clear19/cssd/clear19.pid
>
>REMOVE? y
>
>DUP/BAD I=215730 OWNER=5000 MODE=100644
>SIZE=71002 MTIME=Feb 15 12:18 2006
>FILE=/export/oracle/oracle/crs/log/clear19/cssd/ocssd.log
>
>REMOVE? y
>
>DIRECTORY CORRUPTED I=262435 OWNER=5000 MODE=40750
>SIZE=512 MTIME=Feb 14 11:30 2006
>DIR=/export/oracle/oracle/db/rdbms/log
>
>SALVAGE? y
>
>MISSING '.' I=262435 OWNER=5000 MODE=40750
>SIZE=512 MTIME=Feb 14 11:30 2006
>DIR=/export/oracle/oracle/db/rdbms/log
>
>FIX? y
>
>MISSING '..' I=262435 OWNER=5000 MODE=40750
>SIZE=512 MTIME=Feb 14 11:30 2006
>DIR=/export/oracle/oracle/db/rdbms/log
>
>FIX? y
>
>UNALLOCATED I=269449 OWNER=root MODE=0
>SIZE=0 MTIME=Dec 31 16:00 1969
>NAME=?
>
>REMOVE? y
>
>UNALLOCATED I=269461 OWNER=root MODE=0
>SIZE=0 MTIME=Dec 31 16:00 1969
>NAME=?
>
>REMOVE? y
>
>UNALLOCATED I=269440 OWNER=root MODE=0
>SIZE=0 MTIME=Dec 31 16:00 1969
>NAME=?
>
>REMOVE? y
>
>** Phase 3 - Check Connectivity
>UNREF DIR I=269361 OWNER=5000 MODE=40755
>SIZE=512 MTIME=Jan 30 14:41 2006
>RECONNECT? y
>
>DIR I=269361 CONNECTED. PARENT WAS I=190145
>
>UNREF DIRECTORY I=198874 OWNER=5000 MODE=40770
>SIZE=512 MTIME=Feb 14 11:09 2006
>CLEAR? y
>
>UNREF DIRECTORY I=198257 OWNER=5000 MODE=40770
>SIZE=512 MTIME=Feb 14 11:09 2006
>CLEAR? y
>
>** Phase 4 - Check Reference Counts
>BAD/DUP FILE I=7729 OWNER=adm MODE=100644
>SIZE=66960 MTIME=Feb 15 12:23 2006
>CLEAR? y
>
>UNREF FILE I=7764 OWNER=root MODE=100644
>SIZE=275 MTIME=Jan 15 00:22 2003
>CLEAR? y
>
>BAD/DUP FILE I=190134 OWNER=root MODE=100644
>SIZE=315392 MTIME=Feb 15 12:23 2006
>CLEAR? y
>
>UNREF FILE I=190135 OWNER=root MODE=100600
>SIZE=306 MTIME=Feb 15 12:23 2006
>RECONNECT? y
>
>BAD/DUP FILE I=190138 OWNER=root MODE=100600
>SIZE=2388992 MTIME=Feb 15 11:01 2006
>CLEAR? y
>
>UNREF FILE I=190320 OWNER=root MODE=20620
>SIZE=0 MTIME=Feb 14 15:02 2006
>CLEAR? y
>
>BAD/DUP FILE I=190321 OWNER=5000 MODE=100644
>SIZE=8414720 MTIME=Feb 14 15:03 2006
>CLEAR? y
>
>UNREF FILE I=190322 OWNER=5000 MODE=100644
>SIZE=166 MTIME=Jan 13 15:15 2006
>RECONNECT? y
>
>UNREF FILE I=190323 OWNER=5000 MODE=100644
>SIZE=9 MTIME=Jan 6 09:50 2006
>RECONNECT? y
>
>BAD/DUP FILE I=190324 OWNER=5000 MODE=100644
>SIZE=1929 MTIME=Feb 1 17:25 2006
>CLEAR? y
>
>BAD/DUP FILE I=200646 OWNER=5000 MODE=100640
>SIZE=16200 MTIME=Feb 15 12:16 2006
>CLEAR? y
>
>BAD/DUP FILE I=215729 OWNER=5000 MODE=100644
>SIZE=5 MTIME=Feb 15 12:11 2006
>CLEAR? y
>
>BAD/DUP FILE I=215730 OWNER=5000 MODE=100644
>SIZE=71002 MTIME=Feb 15 12:18 2006
>CLEAR? y
>
>LINK COUNT DIR I=269361 OWNER=5000 MODE=40755
>SIZE=512 MTIME=Jan 30 14:41 2006 COUNT 7 SHOULD BE 5
>ADJUST? y
>
>LINK COUNT DIR I=269408 OWNER=5000 MODE=40755
>SIZE=512 MTIME=Jan 26 13:35 2006 COUNT 6 SHOULD BE 5
>ADJUST? y
>
>** Phase 5 - Check Cyl groups
>FREE BLK COUNT(S) WRONG IN SUPERBLK
>SALVAGE? y
>
>185171 files, 5962958 used, 60079364 free (196308 frags, 7485382 blocks, 0.3%
>fragmentation)
>
>***** FILE SYSTEM WAS MODIFIED *****
>#

-- 
<http://www.sun.com>
Fernando Castano
Staff engineer, MDE
Sun Microsystems, Inc.
260 Constitution Drive
Menlo Park, CA 94025 US
Phone x88904/+1 650 786 8904
Email Fernando.Castano at Sun.COM