It sounds like the SVM services did not get started in time. However, they had been started by the time you ran fsck manually. 'boot -m verbose' will give us more details.
-tony

Sarah J. Jelinek wrote:
> Hi Richard,
>
> Apologies for the delay in getting back to you. I am going to cross-post
> this to the ufs-discuss email list as well.
>
> Based on the symptoms you are seeing, it seems that for some reason the
> data UFS is getting during the fsck at boot is bad, in your examples for /
> and /usr. Your subsequent fsck of that filesystem, per your correspondence
> with Sanjay Nadkarni, finds no failures, as shown in this email to Sanjay:
>
> # fsck -F ufs /dev/md/rdsk/d30
> ** /dev/md/rdsk/d30
> ** Last Mounted on /
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> ** Phase 5 - Check Cyl groups
> 216878 files, 4949179 used, 3627460 free (386204 frags, 405157 blocks, 4.5%
> fragmentation)
>
> So it looks to me like this is a potential mirror resync issue, although
> the device in question on your first test system, d30, looks OK based on
> the metastat output in your original email. I suppose this could be a UFS
> logging issue as well.
>
> Basically, it looks like something is making the system think the
> filesystem in question needs a check. It forces you into system
> maintenance mode, then when you run fsck it all looks OK, so somehow it
> clears itself up.
>
> fs-usr is the script failing in both scenarios you sent data about. It
> remounts the filesystem read/write from read-only during boot, which is
> why it is doing fsck on these filesystems.
>
> So, I need a few things from you if possible to help me see where this is
> failing:
>
> 1. Can you boot as follows: 'boot -m verbose', which should give me more
> data about the SMF services that are running at the time of the failure.
> You will have to halt your system to do this since it doesn't look like
> reboot supports these arguments.
> However, you did say in your subsequent emails to Sanjay that this does
> happen on second reboot as well. My only concern is that halting your
> system and then booting may quiesce the filesystem enough to mask this
> issue. But it is worth a try.
>
> 2. Can you modify your jumpstart install to have a single node mirror
> during install, and see whether this problem continues to happen? Then,
> after the install, attach the other submirror. This would give me some
> data regarding where this might be happening. I am trying to isolate
> mirror resync issues from UFS issues.
>
> 3. Had you seen this issue before b16? I am just trying to narrow down the
> putbacks to Solaris to look at.
>
> thanks,
> sarah
> ******
>
>
>>Hi,
>>I am seeing a problem with snv_16 and snv_18 where, on reboot, the
>>mirrored file systems fail fsck. This problem is most noticeable on the
>>first reboot after my jumpstart builds.
>>My two fcal disks are formatted:-
>>install_type initial_install
>>system_type standalone
>>partitioning explicit
>>filesys mirror:d10 c0t0d0s0 c1t4d0s0 256 / logging
>>filesys mirror:d20 c0t0d0s3 c1t4d0s3 4096 /var logging
>>filesys mirror:d30 c0t0d0s4 c1t4d0s4 4096 /usr logging
>>filesys mirror:d40 c0t0d0s5 c1t4d0s5 1536 /opt logging
>>filesys mirror:d50 c0t0d0s7 c1t4d0s7 15360 /tmp2 logging
>>filesys mirror:d60 c0t0d0s1 c1t4d0s1 free swap
>>metadb c0t0d0s6 size 8192 count 4
>>metadb c1t4d0s6 size 8192 count 4
>>cluster SUNWCXall
>>locale en_GB
>>
>>The install works fine and I have put a metastat at the end of my finish
>>script and all looks ok:-
>>/sbin/metastat
>>metastat: brscs02:
>>        system/metainit:default
>>        system/mdmonitor:default
>>        network/rpc/meta:default: service(s) not online in SMF
>>
>>d60: Mirror
>>    Submirror 0: d61
>>      State: Okay
>>    Submirror 1: d62
>>      State: Okay
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 19182960 blocks (9.1 GB)
>>
>>d61: Submirror of d60
>>    State: Okay
>>    Size: 19182960 blocks (9.1 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c0t0d0s1   0            No     Okay   Yes
>>
>>d62: Submirror of d60
>>    State: Okay
>>    Size: 19182960 blocks (9.1 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c1t4d0s1   0            No     Okay   Yes
>>
>>d50: Mirror
>>    Submirror 0: d51
>>      State: Okay
>>    Submirror 1: d52
>>      State: Okay
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 31458321 blocks (15 GB)
>>
>>d51: Submirror of d50
>>    State: Okay
>>    Size: 31458321 blocks (15 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c0t0d0s7   0            No     Okay   Yes
>>
>>d52: Submirror of d50
>>    State: Okay
>>    Size: 31458321 blocks (15 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c1t4d0s7   0            No     Okay   Yes
>>
>>d40: Mirror
>>    Submirror 0: d41
>>      State: Okay
>>    Submirror 1: d42
>>      State: Okay
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 3146121 blocks (1.5 GB)
>>
>>d41: Submirror of d40
>>    State: Okay
>>    Size: 3146121 blocks (1.5 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c0t0d0s5   0            No     Okay   Yes
>>
>>d42: Submirror of d40
>>    State: Okay
>>    Size: 3146121 blocks (1.5 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c1t4d0s5   0            No     Okay   Yes
>>
>>d30: Mirror
>>    Submirror 0: d31
>>      State: Okay
>>    Submirror 1: d32
>>      State: Okay
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 8389656 blocks (4.0 GB)
>>
>>d31: Submirror of d30
>>    State: Okay
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c0t0d0s4   0            No     Okay   Yes
>>
>>d32: Submirror of d30
>>    State: Okay
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c1t4d0s4   0            No     Okay   Yes
>>
>>d20: Mirror
>>    Submirror 0: d21
>>      State: Okay
>>    Submirror 1: d22
>>      State: Okay
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 8389656 blocks (4.0 GB)
>>
>>d21: Submirror of d20
>>    State: Okay
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c0t0d0s3   0            No     Okay   Yes
>>
>>d22: Submirror of d20
>>    State: Okay
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c1t4d0s3   0            No     Okay   Yes
>>
>>d10: Mirror
>>    Submirror 0: d11
>>      State: Okay
>>    Submirror 1: d12
>>      State: Okay
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 525798 blocks (256 MB)
>>
>>d11: Submirror of d10
>>    State: Okay
>>    Size: 525798 blocks (256 MB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c0t0d0s0   0            No     Okay   Yes
>>
>>d12: Submirror of d10
>>    State: Okay
>>    Size: 525798 blocks (256 MB)
>>    Stripe 0:
>>        Device     Start Block  Dbase  State  Reloc  Hot Spare
>>        c1t4d0s0   0            No     Okay   Yes
>>
>>Device Relocation Information:
>>Device   Reloc  Device ID
>>c1t4d0   Yes    id1,ssd@n20000020375c02d7
>>c0t0d0   Yes    id1,ssd@n2000002037a13604
>>
>>
>>But when it reboots it sometimes fails:-
>>Finish script E3500+login.sh execution completed.
>>
>>The begin script log 'begin.log'
>>is located in /var/sadm/system/logs after reboot.
>>
>>The finish script log 'finish.log'
>>is located in /var/sadm/system/logs after reboot.
>>
>>syncing file systems... done
>>rebooting...
>>Resetting...
>>ttya initialized
>>Using POST's System Configuration
>>Setting up memory
>>fhc ac simm-status environment sram flashprom
>>SUNW,UltraSPARC-II
>>Probing UPA Slot at 2,0 sbus fhc ac environment
>>flashprom eeprom sbus-speed counter-timer
>>Probing UPA Slot at 3,0 sbus counter-timer
>>Probing /sbus@2,0 at d,0 SUNW,socal sf ssd sf ssd
>>Probing /sbus@2,0 at 1,0 QLGC,isp sd st
>>Probing /sbus@2,0 at 2,0 Nothing there
>>Probing /sbus@3,0 at 3,0 SUNW,hme SUNW,fas sd st
>>Probing /sbus@3,0 at 0,0 network
>>5-slot Sun Enterprise E3500, No Keyboard
>>OpenBoot 3.2.30, 2048 MB memory installed, Serial #11240214.
>>Copyright 2002 Sun Microsystems, Inc. All rights reserved
>>Ethernet address 8:0:20:ab:83:16, Host ID: 80ab8316.
>>
>>
>>
>>Rebooting with command: boot
>>
>>Port#1 received soc-status=14
>>Port#0 received soc-status=14 loop 0 is ONLINE
>>Boot device: disk File and args:
>>Loading ufs-file-system package 1.4 04 Aug 1995 13:02:54.
>>FCode UFS Reader 1.12 00/07/17 15:48:16.
>>Loading: /platform/SUNW,Ultra-Enterprise/ufsboot
>>Loading: /platform/sun4u/ufsboot
>>SunOS Release 5.11 Version snv_16 64-bit
>>Copyright 1983-2005 Sun Microsystems, Inc. All rights reserved.
>>Use is subject to license terms.
>>SUNW,sbus-gem0: Using Gigabit SERDES Interface
>>SUNW,sbus-gem0: Auto-Negotiated 1000 Mbps Full-Duplex Link Up
>>Hostname: brscs02
>>The / file system (/dev/md/rdsk/d10) is being checked.
>>The /usr file system (/dev/md/rdsk/d30) is being checked.
>>
>>WARNING - Unable to repair the /usr filesystem. Run fsck
>>manually (fsck -F ufs /dev/md/rdsk/d30).
>>
>>Jul 26 17:50:14 svc.startd[7]: svc:/system/filesystem/usr:default: Method
>>"/lib/svc/method/fs-usr" failed with exit status 95.
>>[ system/filesystem/usr:default failed fatally (see 'svcs -x' for details) ]
>>Requesting System Maintenance Mode
>>(See /lib/svc/share/README for more information.)
>>Console login service(s) cannot run
>>
>>Root password for system maintenance (control-d to bypass):
>>
>>
>>Sometimes the reboot works but it checks the filesystems:-
>>
>>SUNW,sbus-gem0: Auto-Negotiated 1000 Mbps Full-Duplex Link Up
>>
>>Hostname: brscs02
>>
>>The / file system (/dev/md/rdsk/d10) is being checked.
>>
>>Configuring devices.
>>
>>Loading smf(5) service descriptions:
>>
>>
>>Other times another reboot will also fail.
>>
>>
>>I have also seen the problem on an Ultra 60 with two onboard 9 GB SCSI
>>drives.
>>
>>
>>Solaris 10 GA does not seem to have this problem.
>>
>>Cheers
>>Richard.
>
> This message posted from opensolaris.org
> _______________________________________________
> lvm-discuss mailing list
> lvm-discuss at opensolaris.org
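
As a rough aid for the mirror-resync question raised in this thread, here is a minimal sketch (not something posted by anyone in the thread) of a script that scans saved metastat output and lists any metadevice reporting a state other than Okay. The function names `mirror_states` and `needs_attention` are hypothetical, and the parser assumes only the `dNN: Mirror` / `dNN: Submirror of dNN` / `State:` layout seen in the quoted output; other metastat layouts are not handled.

```python
import re

# Assumption: headers look like "d30: Mirror" or "d31: Submirror of d30",
# and state lines look like "State: Okay", as in the quoted metastat output.
HEADER_RE = re.compile(r"^(d\d+):\s+(?:Mirror|Submirror of d\d+)")
STATE_RE = re.compile(r"^State:\s+(.+)$")


def mirror_states(metastat_text):
    """Map each metadevice name to the list of State values in its block.

    For a mirror block the states are those of its submirrors; for a
    submirror block it is the submirror's own state.
    """
    states = {}
    current = None
    for raw in metastat_text.splitlines():
        # Drop mail quoting (">", ">>") and surrounding whitespace.
        line = raw.lstrip("> ").strip()
        header = HEADER_RE.match(line)
        if header:
            current = header.group(1)
            states.setdefault(current, [])
            continue
        state = STATE_RE.match(line)
        if state and current is not None:
            states[current].append(state.group(1).strip())
    return states


def needs_attention(metastat_text):
    """Return the sorted names of metadevices with any state other than Okay."""
    return sorted(
        name
        for name, seen in mirror_states(metastat_text).items()
        if any(s != "Okay" for s in seen)
    )
```

One possible use is to save the metastat output from the finish script and from the first boot after install, run `needs_attention` on each, and compare the results to see whether a submirror is still resyncing when fs-usr runs.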