The basic issue seems to be that the archive wasn't synced on that
node. This sync normally happens automatically during any sort of
orderly shutdown or reboot. So, what exactly is Oracle doing to
instruct the nodes to reboot?
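
For what it's worth, here is a rough way to see whether a node's archive
is behind the files in question. This is only a sketch: stale_files is a
made-up helper name, the archive path in the example call is an
assumption for an x86 node, and comparing modification times only
approximates whatever the boot-archive service really checks.

```shell
#!/bin/sh
# Crude staleness check: print each given file whose modification time is
# newer than the archive's. Assumption: mtime comparison is only an
# approximation of the real boot-archive verification.
stale_files() {
    archive=$1
    shift
    for f in "$@"; do
        # find prints the path only if it was modified after $archive
        find "$f" -newer "$archive" -print 2>/dev/null
    done
}

# Example call (archive path is an assumption, adjust for your platform):
# stale_files /platform/i86pc/boot_archive \
#     /etc/path_to_inst /etc/devices/devid_cache
```

If this prints either file just before Oracle triggers the reboot, that
would suggest the files changed after the last archive sync.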

Also, the particular files that are out of sync are not all that
critical. /etc/path_to_inst holds device instance data, and we should
learn how to merge the newer one. /etc/devices/devid_cache is just a
cache of hints that may make some things faster, but we really shouldn't
worry about it with respect to the boot archive. So, in general you're
safe to clear the check and drive on.
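
Roughly, "clear and drive on" followed by a resync could look like the
sketch below. The svcadm line is the one from the console message;
running bootadm update-archive afterwards is my suggestion so the next
boot doesn't hit the same mismatch. DRY_RUN, run and resync_boot_archive
are names invented for this example, and by default the script only
prints what it would do.

```shell
#!/bin/sh
# Sketch: clear the failed boot-archive service, then rebuild the archive.
# DRY_RUN is a hypothetical knob for this example; when it is 1 (the
# default) the commands are printed instead of executed.
DRY_RUN=${DRY_RUN:-1}

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

resync_boot_archive() {
    # Let the boot continue past the failed check (from the console message).
    run svcadm clear system/boot-archive
    # Rebuild the archive from the files currently on the root filesystem.
    run bootadm update-archive
}

resync_boot_archive
```

Set DRY_RUN=0 and run it as root on the affected node to actually
execute the two commands.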

-jan


> Juan Castano writes:
> 
> >   we are running some tests with oracle RAC.  we have 4 nodes (v40z's
> > with S10U1) that are sharing two storage devices with fully redundant
> > paths (2 HBAs, 2 FC switches, 2 3510s).  As part of the tests we need to
> > inject faults into one of the switches (powercycle).  The 3510's contain only
> > oracle related data.  When the switch comes back up oracle instructs the
> > nodes to reboot. The problem we have is that only one node comes up.
> > The other 3 nodes give me this msg in the console and refuse to boot.  I
> > tried the suggested fix (svcadm clear system/boot-archive) which didn't
> > work.  We have no idea how to
> > 
> > a) avoid this problem,
> > b) fix it. 
> > 
> > Have you seen this before?  any advice?  why is it happening?  is it a bug?
> 
> (Is this Oracle RAC running with Sun Cluster?)
> 
> The service comes from the New Boot project.  Its job is to ensure that 
> the files in the boot archive (which is taken on a clean reboot/
> shutdown) match the files on the filesystem.
> 
> In your case, they didn't -- and you're told about which files mismatch.
> 
> The problem isn't really with SMF, but with whatever is getting those 
> two files out of sync for you.  I'll nudge the New Boot guys and ask 
> them to come take a look.
> 
> I'd guess what's perceived as the hot-plug event (the FC switch 
> powercycle) is causing those files to get updated and maybe Oracle 
> isn't causing a reboot through a path which causes the boot archive to 
> be resynced.  But, we'll let the new boot guys and device experts 
> (fortunately, they're the same bunch) weigh in.
> 
> liane
> 
> > 
> >  thanks in advance for any help,
> > 
> >   Fernando
> > 
> > WARNING - The following files in / differ from the boot archive:
> >     /etc/path_to_inst
> >     /etc/devices/devid_cache
> > The recommended action is to reboot and select "Solaris failsafe"
> > option from the boot menu. Then follow prompts to update the
> > boot archive.
> > To continue booting at your own risk, clear the service:
> > # svcadm clear system/boot-archive
> > 
> > Feb 14 16:39:20 svc.startd[7]: svc:/system/boot-archive:default: Method
> > "/lib/svc/method/boot-archive" failed with exit status 95.
> > [ system/boot-archive:default failed fatally (see 'svcs -x' for details) ]
> > Requesting System Maintenance Mode
> > (See /lib/svc/share/README for more information.)
> > Console login service(s) cannot run
> > 
> > Root password for system maintenance (control-d to bypass): Hostname:
> > clear20
> >
> 
> _______________________________________________
> smf-discuss mailing list
> smf-discuss at opensolaris.org


