Jan,

  I've looked at the nodes and I think that the files (and in particular
the oracle account) are damaged beyond repair.  To reproduce this
problem it is probably better to reinstall everything from scratch
using the same OS and bits.  I'll sync up with the lab and project
members, and will get back to you.

  Many thanks for all your help!

   Fernando

Jan Setje-Eilers wrote On 02/15/06 17:32,:

> I spent some time with the machine that was printing:
>
>Requesting maintenance mode
>(See /lib/svc/share/README for additional information.)
>Requesting maintenance mode
>(See /lib/svc/share/README for additional information.)
>Requesting maintenance mode
>(See /lib/svc/share/README for additional information.)
>
> instead of booting.
>
> First of all the root filesystem had some serious problems. I ran
>fsck and cleaned up, but some log files, snmpd.conf and a backup copy
>of the smf repository were lost. I re-made wtmpx. I also found a
>bunch of missing shadow nodes in /devices; I re-made them, re-built the
>devlinks, and the system is now up and running again.
>
> For the extent of the ufs damage, please see the attached note. I did
>re-run fsck until things were totally clean. Given the ufs damage, I'm
>going to guess that oracle does not shut the machine down cleanly, but
>we should confirm that by re-running your tests with a breakpoint set
>in uadmin() or kadmin(). Every even remotely sane way down comes
>through there, so this should let us know how the system is getting
>there.
>
> However, if it is tearing down the machine as aggressively as the
>filesystem damage makes it appear, then we need to fix what oracle is
>doing.
>
>-jan
>
>
>  
>
>------------------------------------------------------------------------
>
>updating /dev/dsk/c2t0d0s0...
>** /dev/rdsk/c2t0d0s0
>** Last Mounted on /
>** Phase 1 - Check Blocks and Sizes
>PARTIALLY TRUNCATED INODE I=190134
>SALVAGE? y
>
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>1819044972 BAD I=190134
>EXCESSIVE BAD BLKS I=190134
>CONTINUE? y
>
>INCORRECT BLOCK COUNT I=190134 (640 should be 352)
>CORRECT? y
>
>1620344 DUP I=190321
>1620345 DUP I=190321
>1620346 DUP I=190321
>1620347 DUP I=190321
>1620348 DUP I=190321
>1620349 DUP I=190321
>1620350 DUP I=190321
>1620351 DUP I=190321
>1620352 DUP I=190321
>1620353 DUP I=190321
>EXCESSIVE DUP BLKS I=190321
>CONTINUE? y
>
>INCORRECT BLOCK COUNT I=190321 (16464 should be 18)
>CORRECT? y
>
>1620440 DUP I=190324
>1620441 DUP I=190324
>1867352 DUP I=215730
>1867360 DUP I=215730
>1867361 DUP I=215730
>1867362 DUP I=215730
>1867363 DUP I=215730
>1867364 DUP I=215730
>1867365 DUP I=215730
>1867366 DUP I=215730
>1867367 DUP I=215730
>1867368 DUP I=215730
>EXCESSIVE DUP BLKS I=215730
>CONTINUE? y
>
>INCORRECT BLOCK COUNT I=215730 (140 should be 80)
>CORRECT? y
>
>** Phase 1b - Rescan For More DUPS
>1620344 DUP I=7729
>1620345 DUP I=7729
>1620346 DUP I=7729
>1620347 DUP I=7729
>1620348 DUP I=7729
>1620349 DUP I=7729
>1620350 DUP I=7729
>1620351 DUP I=7729
>1620352 DUP I=190138
>1620440 DUP I=190138
>1620441 DUP I=190138
>1867360 DUP I=200646
>1867361 DUP I=200646
>1867362 DUP I=200646
>1867363 DUP I=200646
>1867364 DUP I=200646
>1867365 DUP I=200646
>1867366 DUP I=200646
>1867367 DUP I=200646
>1867352 DUP I=215729
>** Phase 2 - Check Pathnames
>DUP/BAD  I=190138  OWNER=root MODE=100600
>SIZE=2388992 MTIME=Feb 15 11:01 2006 
>FILE=/etc/svc/repository-boot-20060215_110135
>
>REMOVE? y
>
>DUP/BAD  I=7729  OWNER=adm MODE=100644
>SIZE=66960 MTIME=Feb 15 12:23 2006 
>FILE=/var/adm/wtmpx
>
>REMOVE? y
>
>UNALLOCATED  I=190140  OWNER=root MODE=0
>SIZE=0 MTIME=Feb 15 12:23 2006 
>NAME=/var/log/Xorg.0.log
>
>REMOVE? y
>
>DUP/BAD  I=190134  OWNER=root MODE=100644
>SIZE=315392 MTIME=Feb 15 12:23 2006 
>FILE=/var/sma_snmp/snmpd.conf
>
>REMOVE? y
>
>DUP/BAD  I=200646  OWNER=5000 MODE=100640
>SIZE=16200 MTIME=Feb 15 12:16 2006 
>FILE=/export/oracle/oracle/crs/evm/log/clear19_evmlog.20060215
>
>REMOVE? y
>
>DUP/BAD  I=215729  OWNER=5000 MODE=100644
>SIZE=5 MTIME=Feb 15 12:11 2006 
>FILE=/export/oracle/oracle/crs/log/clear19/cssd/clear19.pid
>
>REMOVE? y
>
>DUP/BAD  I=215730  OWNER=5000 MODE=100644
>SIZE=71002 MTIME=Feb 15 12:18 2006 
>FILE=/export/oracle/oracle/crs/log/clear19/cssd/ocssd.log
>
>REMOVE? y
>
>DIRECTORY CORRUPTED  I=262435  OWNER=5000 MODE=40750
>SIZE=512 MTIME=Feb 14 11:30 2006 
>DIR=/export/oracle/oracle/db/rdbms/log
>
>SALVAGE? y
>
>MISSING '.'  I=262435  OWNER=5000 MODE=40750
>SIZE=512 MTIME=Feb 14 11:30 2006 
>DIR=/export/oracle/oracle/db/rdbms/log
>
>FIX? y
>
>MISSING '..'  I=262435  OWNER=5000 MODE=40750
>SIZE=512 MTIME=Feb 14 11:30 2006 
>DIR=/export/oracle/oracle/db/rdbms/log
>
>FIX? y
>
>UNALLOCATED  I=269449  OWNER=root MODE=0
>SIZE=0 MTIME=Dec 31 16:00 1969 
>NAME=?
>
>REMOVE? y
>
>UNALLOCATED  I=269461  OWNER=root MODE=0
>SIZE=0 MTIME=Dec 31 16:00 1969 
>NAME=?
>
>REMOVE? y
>
>UNALLOCATED  I=269440  OWNER=root MODE=0
>SIZE=0 MTIME=Dec 31 16:00 1969 
>NAME=?
>
>REMOVE? y
>
>** Phase 3 - Check Connectivity
>UNREF DIR  I=269361  OWNER=5000 MODE=40755
>SIZE=512 MTIME=Jan 30 14:41 2006 
>RECONNECT? y
>
>DIR I=269361 CONNECTED. PARENT WAS I=190145
>
>UNREF DIRECTORY I=198874  OWNER=5000 MODE=40770
>SIZE=512 MTIME=Feb 14 11:09 2006 
>CLEAR? y
>
>UNREF DIRECTORY I=198257  OWNER=5000 MODE=40770
>SIZE=512 MTIME=Feb 14 11:09 2006 
>CLEAR? y
>
>** Phase 4 - Check Reference Counts
>BAD/DUP FILE I=7729  OWNER=adm MODE=100644
>SIZE=66960 MTIME=Feb 15 12:23 2006 
>CLEAR? y
>
>UNREF FILE I=7764  OWNER=root MODE=100644
>SIZE=275 MTIME=Jan 15 00:22 2003 
>CLEAR? y
>
>BAD/DUP FILE I=190134  OWNER=root MODE=100644
>SIZE=315392 MTIME=Feb 15 12:23 2006 
>CLEAR? y
>
>UNREF FILE  I=190135  OWNER=root MODE=100600
>SIZE=306 MTIME=Feb 15 12:23 2006 
>RECONNECT? y
>
>BAD/DUP FILE I=190138  OWNER=root MODE=100600
>SIZE=2388992 MTIME=Feb 15 11:01 2006 
>CLEAR? y
>
>UNREF FILE  I=190320  OWNER=root MODE=20620
>SIZE=0 MTIME=Feb 14 15:02 2006 
>CLEAR? y
>
>BAD/DUP FILE I=190321  OWNER=5000 MODE=100644
>SIZE=8414720 MTIME=Feb 14 15:03 2006 
>CLEAR? y
>
>UNREF FILE  I=190322  OWNER=5000 MODE=100644
>SIZE=166 MTIME=Jan 13 15:15 2006 
>RECONNECT? y
>
>UNREF FILE  I=190323  OWNER=5000 MODE=100644
>SIZE=9 MTIME=Jan  6 09:50 2006 
>RECONNECT? y
>
>BAD/DUP FILE I=190324  OWNER=5000 MODE=100644
>SIZE=1929 MTIME=Feb  1 17:25 2006 
>CLEAR? y
>
>BAD/DUP FILE I=200646  OWNER=5000 MODE=100640
>SIZE=16200 MTIME=Feb 15 12:16 2006 
>CLEAR? y
>
>BAD/DUP FILE I=215729  OWNER=5000 MODE=100644
>SIZE=5 MTIME=Feb 15 12:11 2006 
>CLEAR? y
>
>BAD/DUP FILE I=215730  OWNER=5000 MODE=100644
>SIZE=71002 MTIME=Feb 15 12:18 2006 
>CLEAR? y
>
>LINK COUNT DIR I=269361  OWNER=5000 MODE=40755
>SIZE=512 MTIME=Jan 30 14:41 2006  COUNT 7 SHOULD BE 5
>ADJUST? y
>
>LINK COUNT DIR I=269408  OWNER=5000 MODE=40755
>SIZE=512 MTIME=Jan 26 13:35 2006  COUNT 6 SHOULD BE 5
>ADJUST? y
>
>** Phase 5 - Check Cyl groups
>FREE BLK COUNT(S) WRONG IN SUPERBLK
>SALVAGE? y
>
>185171 files, 5962958 used, 60079364 free (196308 frags, 7485382 blocks, 0.3% fragmentation)
>
>***** FILE SYSTEM WAS MODIFIED *****
># 
>
>  
>
>------------------------------------------------------------------------
>
>_______________________________________________
>smf-discuss mailing list
>smf-discuss at opensolaris.org
>  
>
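
For anyone who wants a quick list of what fsck touched in Phase 2, here
is a minimal Python sketch that pulls the status, inode, owner and path
out of a log like the one quoted above. The function name
summarize_fsck and both regular expressions are my own assumptions,
based only on the line shapes visible in this log; other fsck versions
may print different fields.

```python
import re

# Status lines look like: "DUP/BAD  I=190138  OWNER=root MODE=100600"
STATUS_RE = re.compile(
    r"^(?P<status>[A-Z/'. ]+?)\s+I=(?P<inode>\d+)"
    r"\s+OWNER=(?P<owner>\S+)\s+MODE=(?P<mode>\d+)"
)
# In Phase 2 a path line follows, e.g. "FILE=/var/adm/wtmpx"
PATH_RE = re.compile(r"^(?:FILE|DIR|NAME)=(?P<path>.+)$")


def summarize_fsck(log_text):
    """Return (status, inode, owner, path) for every entry that names a path."""
    entries = []
    pending = None  # most recent status line still waiting for its path
    for raw in log_text.splitlines():
        line = raw.lstrip(">").strip()  # drop mail quoting and padding
        status = STATUS_RE.match(line)
        if status:
            pending = (
                status.group("status").strip(),
                int(status.group("inode")),
                status.group("owner"),
            )
            continue
        path = PATH_RE.match(line)
        if path and pending:
            entries.append(pending + (path.group("path"),))
            pending = None
    return entries
```

Run against the quoted log, it should list entries such as
/var/adm/wtmpx and /var/sma_snmp/snmpd.conf together with their inodes.
Phase 3 and Phase 4 entries are skipped because they carry no path line.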

-- 
<http://www.sun.com>    * Fernando Castano *
Staff engineer, MDE

*Sun Microsystems, Inc.*
260 Constitution Drive
Menlo Park, CA 94025 US
Phone x88904/+1 650 786 8904
Email Fernando.Castano at Sun.COM
