It sounds like the SVM services did not get started in time. However, they
had been started by the time you ran fsck manually. 'boot -m verbose' will
give us more details.
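
Once the box is in maintenance mode, something like the following might help confirm whether the SVM services were online. This is only a rough sketch: the function name not_online is made up for illustration, and the sample input is hypothetical rather than taken from your system. It assumes the two-column "state fmri" layout that `svcs -H -o state,fmri` prints.

```shell
#!/bin/sh
# Read "STATE FMRI" pairs on stdin (the layout produced by
# `svcs -H -o state,fmri`) and print every service that is not online.
# not_online is a hypothetical helper name, not part of SMF.
not_online() {
    awk 'NF >= 2 && $1 != "online" { print $2 " (" $1 ")" }'
}

# On the affected host the input would come from svcs, for example:
#   svcs -H -o state,fmri system/metainit system/mdmonitor network/rpc/meta | not_online
# The lines below are a hypothetical sample standing in for real output.
printf '%s\n' \
    'online  svc:/system/metainit:default' \
    'offline svc:/system/mdmonitor:default' \
    'offline svc:/network/rpc/meta:default' | not_online
```

An empty result would suggest the SVM services were up, which would point away from a late service start.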

-tony

Sarah J. Jelinek wrote:
> Hi Richard,
> 
> Apologies for the delay in getting back to you. I am going to cross post this 
> to the ufs-discuss email list as well.
> 
> Based on the symptoms you are seeing, it seems that for some reason the 
> data UFS gets during the boot-time fsck is bad, for / and /usr in your 
> examples. Yet your subsequent manual fsck of the same filesystem reports no 
> failures, as shown in this email from your correspondence with Sanjay 
> Nadkarni:
> 
> # fsck -F ufs /dev/md/rdsk/d30
> ** /dev/md/rdsk/d30
> ** Last Mounted on /
> ** Phase 1 - Check Blocks and Sizes
> ** Phase 2 - Check Pathnames
> ** Phase 3 - Check Connectivity
> ** Phase 4 - Check Reference Counts
> ** Phase 5 - Check Cyl groups
> 216878 files, 4949179 used, 3627460 free (386204 frags, 405157 blocks, 4.5% 
> fragmentation)
> 
> So it looks to me like this is a potential mirror resync issue, 
> although your original email shows that the device in question on your first 
> test system, d30, looks OK based on your metastat output. I suppose this 
> could be a UFS logging issue as well. 
> 
> Basically, it looks like something is making the system think the filesystem 
> in question needs a check, which forces you into system maintenance mode. 
> Then, when you run fsck, it all looks OK, so somehow it clears itself up. 
> 
> fs-usr is the script failing in both scenarios you sent data about. It is 
> the method for the system/filesystem/usr service, which remounts the 
> filesystem read/write from read-only during boot; that is why it runs fsck 
> on these filesystems.
> 
> So, I need a few things from you if possible to try to help me see where this 
> is failing:
> 
> 1. Can you boot as follows: 'boot -m verbose', which should give me more data 
> about the SMv services that are running at the time of the failure. You will 
> have to halt your system to do this, since it doesn't look like reboot 
> supports these arguments. However, you did say in your subsequent emails to 
> Sanjay that this happens on the second reboot as well. My only concern is 
> that halting your system and then booting may quiesce the filesystem 
> enough to mask this issue. But it is worth a try.
> 
> 2. Can you modify your jumpstart install to have a single node mirror during 
> install, and see whether this problem continues to happen? Then, after the 
> install, attach the other submirror. This would give me some data about 
> where this might be happening. I am trying to isolate mirror resync issues 
> from UFS issues.
> 
> 3. Had you seen this issue before b16? I am just trying to narrow down which 
> putbacks to Solaris to look at.
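
For the resync question, a quick filter over metastat output can show whether any mirror or submirror is in a state other than Okay right after the reboot. The following is a minimal sketch, not a tested diagnostic: the helper name unhealthy_metadevices is hypothetical, the sample input is made up, and it assumes the "dNN: ..." headings and "State: ..." lines seen in the metastat output quoted in this thread.

```shell
#!/bin/sh
# Scan metastat-style text on stdin and print "<device>: <state>" for every
# State: line whose value is not "Okay". The device is taken from the most
# recent "dNN:" heading. unhealthy_metadevices is a hypothetical helper name.
unhealthy_metadevices() {
    awk '
    /^d[0-9]+:/ { dev = $1; sub(/:$/, "", dev) }
    $1 == "State:" && $2 != "Okay" {
        state = $2
        for (i = 3; i <= NF; i++) state = state " " $i
        print dev ": " state
    }'
}

# On the real system: /sbin/metastat | unhealthy_metadevices
# Hypothetical sample with one submirror still resyncing:
printf '%s\n' \
    'd30: Mirror' \
    '    Submirror 0: d31' \
    '      State: Okay' \
    '    Submirror 1: d32' \
    '      State: Resyncing' | unhealthy_metadevices
```

Running this from the maintenance shell before the manual fsck, and again afterwards, might show whether a resync was still in progress when the boot-time check failed.
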
> 
> thanks,
> sarah
> ******
> 
> 
>>Hi,
>>I am seeing a problem with snv_16 and snv_18 where, on
>>reboot, the mirrored file systems fail fsck. This
>>problem is most noticeable on the first reboot after
>>my jumpstart builds.
>>My two fcal disks are formatted:-
>>install_type    initial_install
>>system_type     standalone
>>partitioning    explicit
>>filesys         mirror:d10 c0t0d0s0 c1t4d0s0 256 / logging
>>filesys         mirror:d20 c0t0d0s3 c1t4d0s3 4096 /var logging
>>filesys         mirror:d30 c0t0d0s4 c1t4d0s4 4096 /usr logging
>>filesys         mirror:d40 c0t0d0s5 c1t4d0s5 1536 /opt logging
>>filesys         mirror:d50 c0t0d0s7 c1t4d0s7 15360 /tmp2 logging
>>filesys         mirror:d60 c0t0d0s1 c1t4d0s1 free swap
>>metadb          c0t0d0s6 size 8192 count 4
>>metadb          c1t4d0s6 size 8192 count 4
>>cluster         SUNWCXall
>>locale          en_GB
>>
>>The install works fine, and I have put a metastat at
>>the end of my finish script and all looks OK:-
>>/sbin/metastat
>>metastat: brscs02: 
>>        system/metainit:default
>>        system/mdmonitor:default
>>        network/rpc/meta:default: service(s) not online in SMF
>>
>>d60: Mirror
>>    Submirror 0: d61
>>      State: Okay         
>>    Submirror 1: d62
>>      State: Okay         
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 19182960 blocks (9.1 GB)
>>
>>d61: Submirror of d60
>>    State: Okay         
>>    Size: 19182960 blocks (9.1 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c0t0d0s1          0     No            Okay   Yes
>>
>>
>>d62: Submirror of d60
>>    State: Okay         
>>    Size: 19182960 blocks (9.1 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c1t4d0s1          0     No            Okay   Yes
>>
>>
>>d50: Mirror
>>    Submirror 0: d51
>>      State: Okay         
>>    Submirror 1: d52
>>      State: Okay         
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 31458321 blocks (15 GB)
>>
>>d51: Submirror of d50
>>    State: Okay         
>>    Size: 31458321 blocks (15 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c0t0d0s7          0     No            Okay   Yes
>>
>>
>>d52: Submirror of d50
>>    State: Okay         
>>    Size: 31458321 blocks (15 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c1t4d0s7          0     No            Okay   Yes
>>
>>
>>d40: Mirror
>>    Submirror 0: d41
>>      State: Okay         
>>    Submirror 1: d42
>>      State: Okay         
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 3146121 blocks (1.5 GB)
>>
>>d41: Submirror of d40
>>    State: Okay         
>>    Size: 3146121 blocks (1.5 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c0t0d0s5          0     No            Okay   Yes
>>
>>
>>d42: Submirror of d40
>>    State: Okay         
>>    Size: 3146121 blocks (1.5 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c1t4d0s5          0     No            Okay   Yes
>>
>>
>>d30: Mirror
>>    Submirror 0: d31
>>      State: Okay         
>>    Submirror 1: d32
>>      State: Okay         
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 8389656 blocks (4.0 GB)
>>
>>d31: Submirror of d30
>>    State: Okay         
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c0t0d0s4          0     No            Okay   Yes
>>
>>
>>d32: Submirror of d30
>>    State: Okay         
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c1t4d0s4          0     No            Okay   Yes
>>
>>
>>d20: Mirror
>>    Submirror 0: d21
>>      State: Okay         
>>    Submirror 1: d22
>>      State: Okay         
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 8389656 blocks (4.0 GB)
>>
>>d21: Submirror of d20
>>    State: Okay         
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c0t0d0s3          0     No            Okay   Yes
>>
>>
>>d22: Submirror of d20
>>    State: Okay         
>>    Size: 8389656 blocks (4.0 GB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c1t4d0s3          0     No            Okay   Yes
>>
>>
>>d10: Mirror
>>    Submirror 0: d11
>>      State: Okay         
>>    Submirror 1: d12
>>      State: Okay         
>>    Pass: 1
>>    Read option: roundrobin (default)
>>    Write option: parallel (default)
>>    Size: 525798 blocks (256 MB)
>>
>>d11: Submirror of d10
>>    State: Okay         
>>    Size: 525798 blocks (256 MB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c0t0d0s0          0     No            Okay   Yes
>>
>>
>>d12: Submirror of d10
>>    State: Okay         
>>    Size: 525798 blocks (256 MB)
>>    Stripe 0:
>>Device     Start Block  Dbase        State Reloc Hot Spare
>>c1t4d0s0          0     No            Okay   Yes
>>
>>
>>Device Relocation Information:
>>Device   Reloc  Device ID
>>c1t4d0   Yes    id1,ssd@n20000020375c02d7
>>c0t0d0   Yes    id1,ssd@n2000002037a13604
>>
>>
>>But when it reboots it sometimes fails:-
>>Finish script E3500+login.sh execution completed.
>>
>>The begin script log 'begin.log'
>>is located in /var/sadm/system/logs after reboot.
>>
>>The finish script log 'finish.log'
>>is located in /var/sadm/system/logs after reboot.
>>
>>syncing file systems... done
>>rebooting...
>>Resetting... 
>>ttya initialized
>>Using POST's System Configuration
>>Setting up memory
>>fhc ac simm-status environment sram flashprom
>>SUNW,UltraSPARC-II 
>>Probing UPA Slot at 2,0   sbus fhc ac environment flashprom eeprom sbus-speed counter-timer 
>>Probing UPA Slot at 3,0   sbus counter-timer 
>>Probing /sbus@2,0 at d,0  SUNW,socal sf ssd sf ssd 
>>Probing /sbus@2,0 at 1,0  QLGC,isp sd st 
>>Probing /sbus@2,0 at 2,0  Nothing there
>>Probing /sbus@3,0 at 3,0  SUNW,hme SUNW,fas sd st 
>>Probing /sbus@3,0 at 0,0  network 
>>5-slot Sun Enterprise E3500, No Keyboard
>>OpenBoot 3.2.30, 2048 MB memory installed, Serial #11240214.
>>Copyright 2002 Sun Microsystems, Inc.  All rights reserved
>>Ethernet address 8:0:20:ab:83:16, Host ID: 80ab8316.
>>
>>
>>
>>Rebooting with command: boot
>>
>>Port#1 received soc-status=14 
>>Port#0 received soc-status=14 loop 0 is ONLINE
>>Boot device: disk  File and args: 
>>Loading ufs-file-system package 1.4 04 Aug 1995 13:02:54. 
>>FCode UFS Reader 1.12 00/07/17 15:48:16. 
>>Loading: /platform/SUNW,Ultra-Enterprise/ufsboot
>>Loading: /platform/sun4u/ufsboot
>>SunOS Release 5.11 Version snv_16 64-bit
>>Copyright 1983-2005 Sun Microsystems, Inc.  All rights reserved.
>>Use is subject to license terms.
>>SUNW,sbus-gem0: Using Gigabit SERDES Interface
>>SUNW,sbus-gem0: Auto-Negotiated 1000 Mbps Full-Duplex Link Up
>>Hostname: brscs02
>>The / file system (/dev/md/rdsk/d10) is being checked.
>>The /usr file system (/dev/md/rdsk/d30) is being checked.
>>
>>WARNING - Unable to repair the /usr filesystem. Run fsck
>>manually (fsck -F ufs /dev/md/rdsk/d30).
>>
>>Jul 26 17:50:14 svc.startd[7]: svc:/system/filesystem/usr:default: Method "/lib/svc/method/fs-usr" failed with exit status 95.
>>[ system/filesystem/usr:default failed fatally (see 'svcs -x' for details) ]
>>Requesting System Maintenance Mode
>>(See /lib/svc/share/README for more information.)
>>Console login service(s) cannot run
>>
>>Root password for system maintenance (control-d to bypass): 
>>
>>
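
On the "exit status 95" in the svc.startd message above: as far as I recall, SMF method scripts return the codes defined in /lib/svc/share/smf_include.sh, where 95 is the fatal-error code. The sketch below just maps a few of those codes to their names. The function name smf_exit_name is hypothetical, and the values are from memory, so check them against smf_include.sh on your build.

```shell
#!/bin/sh
# Map an SMF method exit status to its symbolic name.
# Values as recalled from /lib/svc/share/smf_include.sh; treat them as an
# assumption and verify on the target system. smf_exit_name is a
# hypothetical helper, not part of SMF.
smf_exit_name() {
    case "$1" in
        0)  echo "SMF_EXIT_OK" ;;
        95) echo "SMF_EXIT_ERR_FATAL" ;;
        96) echo "SMF_EXIT_ERR_CONFIG" ;;
        *)  echo "unknown" ;;
    esac
}

# The status reported for fs-usr in the boot log above:
smf_exit_name 95
```

If that reading is right, fs-usr gave up because the boot-time fsck could not repair /usr, which matches the WARNING line printed just before it.
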
>>Sometimes the reboot works but it checks the
>>filesystems:-
>>
>>SUNW,sbus-gem0: Auto-Negotiated 1000 Mbps Full-Duplex Link Up
>>
>>Hostname: brscs02
>>
>>The / file system (/dev/md/rdsk/d10) is being checked.
>>
>>Configuring devices.
>>
>>Loading smf(5) service descriptions:   
>>
>>
>>Other times another reboot will also fail.
>>
>>
>>I have also seen the problem on an Ultra 60 with two
>>onboard 9GB SCSI drives.
>>
>>
>>Solaris 10 GA does not seem to have this problem.
>>
>>Cheers
>>Richard.
> 
> This message posted from opensolaris.org
> _______________________________________________
> lvm-discuss mailing list
> lvm-discuss at opensolaris.org

