You seem to be arguing that we can come up with old data without 
rolling/committing logs. If that's acceptable, then we can come up 
with the old archive as well and can just eliminate the check.

 I get the impression that you're placing some value on how old the
data is. However, in the case of a non-interfaced binary kernel
component, that really doesn't matter. What matters is whether it's 
compatible with the rest of the bits. So either way we'd need 
the check.

-jan


> (earlier messages in this thread should be visible in a day or so at
> http://www.opensolaris.org/os/community/arc/caselog/2006/525/mail;
> sorry for the confusion).
> 
> On Wed, 2007-08-01 at 16:15 -0700, Jan Setje-Eilers wrote:
> > So you prefer to load corrupt data in the case of a reboot soon 
> > after an update to an archive check warning?
> 
> Not corrupt data, slightly old data.
> 
> I realize that there are lingering nightmares of boots interrupted by
> corruption caused by a long-standing bug in logging UFS, but that bug --
> 4782952 -- was fixed late in s10.
> 
> During s10 development I had to manually recover a bunch of systems,
> typically due to munched md.conf files.  logging ufs before the fix to
> 4782952 permitted blocks belonging to a file stably stored to be freed
> and then overwritten before the transaction which freed the block
> committed.
> 
> (now, as is typical for changes to solaris UFS, that fix needed some
> followup work, but as best I can tell, things have damped out..)
> 
> I saw boot failures on mirrored root sparc systems on a regular basis
> before this bug was fixed and haven't seen them since.  I don't want to
> start seeing them again if this project integrates.
> 
> > > In the case of ZFS root, my understanding is that the worst that can
> > > happen if we don't commit the intent log before reading is that we will
> > > read /etc/system contents which doesn't contain edits made during the
> > > last few seconds before a crash.  
> > 
> >  I can't confirm or deny that,
> 
> You don't need to; Neil Perrin confirmed this recently; see the mail log
> for 2007/171 (ZFS Separate Intent Log).  I asked:
> 
> > As I understand it, loss of the information in the intent log means that
> > the last few seconds of changes to a pool have been lost, but the pool
> > is otherwise intact.
> 
> His response was "True"
> 
> > but even if that's the case you then have no way to know that the old copy 
> > was loaded.
> 
> I'm not sure that's a problem -- you're booting a point-in-time
> consistent config, just not necessarily the up-to-the-millisecond config
> at the time of the crash.
> 
> Now, that's not good enough if we crash in the middle of a pkgadd -- but
> then the boot archive doesn't help very much then, either (because once
> we come up and discard the boot archive we may still load a mix of old
> and new kernel modules)
> 
> For cases where there isn't a need for consistent updates to multiple
> files we should be fine.
> 
> > > If we get /etc/system out of the boot archive, it may be months out of
> > > date.
> > 
> >  In which case we catch this when the archive contents are verified.
> 
> And essentially crash/hang until an expert comes along to rescue the
> system, which is IMHO unacceptable behavior.
> 
> > > >  It's also potentially very unsafe to do so due to the log issue.
> > > 
> > > Huh?  Not with zfs root -- the on-disk state will be self-consistent as
> > > of the last time an uberblock update committed -- at most a few seconds
> > > old.
> > 
> >  But it is with ufs. 
> 
> As best as I can tell, it's not been unsafe for ufs since 4782952 was
> fixed.  I saw lots of problems on pre-FCS s10, but I've never seen
> problems of lufs corruption breaking boot on sparc systems running s10
> FCS or nevada.
> 
> >  If they aren't in the archive, then their state would have to be
> > managed to ensure that they aren't unrolled or uncommitted.
> 
> But that's not how (working) lufs and zfs actually work.  If the code
> which updates these files does the usual copy-edit-fsync-rename dance,
> there should never be a window where the on-disk structure even
> *without* the log contains something other than either the old or the
> new version of the file.
> 
> 
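
For reference, the copy-edit-fsync-rename sequence mentioned in the quoted
text can be sketched roughly as below. This is only an illustration in
Python, not code from bootadm or any Solaris utility; the function name
atomic_update and the edit callback are made up for the example. It also
assumes a POSIX system where a directory can be opened and fsync'ed.

```python
import os
import tempfile


def atomic_update(path, edit):
    """Replace `path` with edit(old_bytes) using copy-edit-fsync-rename.

    Hypothetical helper for illustration only. At every point in time,
    `path` names either the complete old file or the complete new file.
    """
    dirname = os.path.dirname(os.path.abspath(path))

    # copy: read the current contents
    with open(path, "rb") as f:
        old = f.read()

    # edit: write the new contents to a temporary file in the SAME
    # directory, so the later rename stays within one file system.
    # Note: mkstemp creates the file with mode 0600; a real tool would
    # also carry over the original owner and permissions.
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as out:
            out.write(edit(old))
            out.flush()
            # fsync: force the new data to stable storage before it
            # becomes visible under the real name
            os.fsync(out.fileno())
        # rename: atomically switch the name to the new file
        os.replace(tmp, path)
    except BaseException:
        # On failure, drop the temporary file; the original is untouched.
        try:
            os.unlink(tmp)
        except FileNotFoundError:
            pass
        raise

    # Flush the directory entry as well, so the rename itself is durable.
    dfd = os.open(dirname, os.O_RDONLY)
    try:
        os.fsync(dfd)
    finally:
        os.close(dfd)
```

A caller would use it along the lines of
atomic_update("/etc/system", lambda old: old + b"set foo=1\n"), where the
appended setting is again just a placeholder. The point of the ordering is
that the data is on disk before the rename, so a crash leaves either the
old or the new file under the real name and never a partial one.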


