Re: -current unusable after a crash

2002-11-25 Thread Marcin Dalecki
Dan Nelson wrote:
> In the last episode (Nov 25), Terry Lambert said:
> > Marcin Dalecki wrote:
> > > > I don't think this is really possible.
> > > >
> > > > I went looking for a generic "application use" CMOS area for this
> > > > sort of thing a while back, and I was unable to find one.
> > >
> > > Well you should please take a look at the "fast boot" option of
> > > moderately modern BIOS-es. Something along those lines went right
> > > now in to the linux kernel. Seems pretty adequate to me, since you
> > > would be even able to control it through the BIOS setup...
> >
> > Is there documentation available for this anywhere?  The BIOS vendor
> > documentation, not the Linux source code.
>
> http://www.microsoft.com/hwdev/resources/specs/simp_bios.asp
> http://www.microsoft.com/hwdev/resources/specs/simp_boot.asp
>
> is the best I could find; you'll need a Word doc viewer.

OpenOffice works fine on my FreeBSD setup ;-).

> It's mainly
> geared toward detecting boot failure rather than abnormal shutdowns,
> though. What we need is a matching "Simple Shutdown Flag" variable.

Personally, I was not thinking that much about the particular problem at
hand. I think the fast boot BIOS preference should simply be exported
by the kernel as a sysconf, erm, sorry, sysctl constant value to allow
for easy checking by userland. One could imagine that a lot
of other preferences could be controlled by it as well, like,
for example, starting dhclient detached from the current console, just
to give the user a login prompt as fast as possible, and so on...

--
	Marcin Dalecki


To Unsubscribe: send mail to [EMAIL PROTECTED]
with "unsubscribe freebsd-current" in the body of the message



Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert
Dan Nelson wrote:
> > Is there documentation available for this anywhere?  The BIOS vendor
> > documentation, not the Linux source code.
> 
> http://www.microsoft.com/hwdev/resources/specs/simp_bios.asp
> http://www.microsoft.com/hwdev/resources/specs/simp_boot.asp
> 
> is the best I could find; you'll need a Word doc viewer.  It's mainly
> geared toward detecting boot failure rather than abnormal shutdowns,
> though. What we need is a matching "Simple Shutdown Flag" variable.

The license you have to agree to to download it permits implementation
for firmware, but not for the OS: 1(a)(iii), 1(b), 2(b)(b), 3.

According to the documentation at the end of the page of the URL
you posted above, the OS has to be fully PnP compliant for it to
work as expected, and multiboot is not supported.

For those interested in pursuing this, more information (no license
agreement required) is available from:

http://www.microsoft.com/hwdev/platform/performance/fastboot/fastboot-winxp.asp

Though I expect you won't be able to implement without the specification.

I guess looking at the Linux code as a reference is OK... they
violated the license, not you, so it's not the same thing as you
violating the license.

-- Terry




Re: -current unusable after a crash

2002-11-25 Thread Dan Nelson
In the last episode (Nov 25), Terry Lambert said:
> Marcin Dalecki wrote:
> > > I don't think this is really possible.
> > >
> > > I went looking for a generic "application use" CMOS area for this
> > > sort of thing a while back, and I was unable to find one.
> > 
> > Well you should please take a look at the "fast boot" option of
> > moderately modern BIOS-es. Something along those lines went right
> > now in to the linux kernel. Seems pretty adequate to me, since you
> > would be even able to control it through the BIOS setup...
> 
> Is there documentation available for this anywhere?  The BIOS vendor
> documentation, not the Linux source code.

http://www.microsoft.com/hwdev/resources/specs/simp_bios.asp
http://www.microsoft.com/hwdev/resources/specs/simp_boot.asp 

is the best I could find; you'll need a Word doc viewer.  It's mainly
geared toward detecting boot failure rather than abnormal shutdowns,
though. What we need is a matching "Simple Shutdown Flag" variable.

-- 
Dan Nelson
[EMAIL PROTECTED]




Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert
Marcin Dalecki wrote:
> > I don't think this is really possible.
> >
> > I went looking for a generic "application use" CMOS area for this
> > sort of thing a while back, and I was unable to find one.
> 
> Well you should please take a look at the "fast boot" option
> of moderately modern BIOS-es. Something along those lines went right now
> in to the linux kernel. Seems pretty adequate to me, since you would
> be even able to control it through the BIOS setup...

Is there documentation available for this anywhere?  The BIOS
vendor documentation, not the Linux source code.

My gut feeling is that this isn't going to be too helpful,
without AC failure notification with a DC holdup time.

The problem is that the best case is power failure, and the
worst case is a corrupted GDT and a double panic off a trap 12
in the trap 12 handler (such that you would get a trap 12 when
you tried to write back to the CMOS that this was the worst
case, not the best case).

Basically, you are still stuck needing power failure notification,
so you can write the cause of the failure back.

At startup, you have to set the saved state to "worst possible
failure: no way to update cause of failure in CMOS", and then
back off to softer failure modes from there.
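
A minimal sketch of that "assume the worst, then back off" ordering,
in Python for brevity.  The one-byte NVRAM cell, the state names, and
their ordering are all assumptions made for illustration; no such
standard CMOS location is known to exist, which is the complaint here.

```python
from enum import IntEnum


class CrashState(IntEnum):
    """Hypothetical failure modes, ordered from softest to worst."""
    CLEAN_SHUTDOWN = 0
    POWER_FAILURE = 1      # recordable only with AC-fail notice + DC holdup
    PANIC_WITH_DUMP = 2
    UNKNOWN_WORST = 3      # nothing could be written back before dying


class Nvram:
    """Stand-in for a persistent one-byte cell (hypothetical, not a real API)."""
    def __init__(self, value=CrashState.CLEAN_SHUTDOWN):
        self.value = CrashState(value)


def boot(nvram):
    """Read the previous run's state, then pessimistically arm the cell."""
    previous = nvram.value
    # If we die without managing to write anything, this is what is found.
    nvram.value = CrashState.UNKNOWN_WORST
    return previous


def record_softer(nvram, state):
    """Back off to a softer failure mode; never overwrite with a worse one."""
    if state < nvram.value:
        nvram.value = state


def background_fsck_safe(previous):
    """Treat only a recorded power failure as safe for background fsck."""
    return previous == CrashState.POWER_FAILURE
```

A silent reboot or double panic never reaches record_softer(), so the
next boot() reads UNKNOWN_WORST and would have to fall back to a full
fsck.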

I think this "Fast boot" stuff is useful, but the way it's
useful is if your main memory is reflected to a separate area
of the disk, so that you can bring up the system image very
quickly.

Basically, it means that it's not at all useful for the problem
at hand, unless it provides for power fail notification.

-- Terry




Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert
Brad Knowles wrote:
> At 2:02 PM -0800 2002/11/25, Terry Lambert wrote:
> >  If you made system dumps mandatory (or marked swap with a non-dump
> >  header in case of panic), this still would not handle the "silent
> >  reboot", "double panic", or "single panic with disk I/O trashed"
> >  cases.  8-(.
> 
> How about we do the safe thing, and only do background fsck if we
> can prove that the system state is something where it would be
> suitable?  Or would that mean that we almost never do background fsck?

It would mean that you can *never* background fsck safely.

The problem is that you need to distinguish a power failure,
which is technically the only safe time to do it, from all
other failure modes.

You can distinguish, at least on R/W FS's, whether or not to
do any fsck (by looking at the "clean" bit), but all other bets
are off.

One approach that works well for desktop systems is to implement
a "soft read-only".  We did this at Artisoft in 1995/1996, when
we ported the VFS stacking framework to Windows 95 and first
implemented soft updates for FFS/UFS, which we ported to run
on Windows 95 under the stacking framework.

The way a "soft read-only" works is to leave the FS mounted
read/write, and then insert at write attempts, everywhere that
read-only is checked, a check for a "soft read-only" bit on
the in-core superblock.

Basically, we flush out all writeable state to the FS, and then
set the clean bit in the superblock, and flush it to disk, if
I/O on the FS has been idle for a while.

Then, when someone wants to write to it, we mark the FS "dirty" again,
flush the superblock back out to disk, and, once we know that
the change has been committed to stable storage, we permit the
write operation to continue.
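
The cycle above can be sketched as a small state machine.  This is an
illustrative Python model only, not the Artisoft or FreeBSD code: the
Disk stand-in, the class and method names, and the 30-second idle
threshold are all assumptions.

```python
class Disk:
    """Stand-in for stable storage: the on-disk clean flag plus data blocks."""
    def __init__(self):
        self.clean = True
        self.blocks = {}


class SoftReadOnlyFS:
    def __init__(self, disk, idle_threshold=30.0):
        self.disk = disk
        self.idle_threshold = idle_threshold  # seconds; arbitrary assumption
        self.soft_read_only = disk.clean      # bit on the in-core superblock
        self.pending = {}                     # dirty buffers not yet on disk
        self.last_write = 0.0

    def write(self, block, data, now):
        if self.soft_read_only:
            # Commit "dirty" to stable storage before the write may proceed.
            self.disk.clean = False
            self.soft_read_only = False
        self.pending[block] = data
        self.last_write = now

    def tick(self, now):
        """Periodic check: go soft read-only once I/O has been idle a while."""
        if self.soft_read_only or now - self.last_write < self.idle_threshold:
            return
        self.disk.blocks.update(self.pending)  # flush all writable state
        self.pending.clear()
        self.disk.clean = True                 # then flush the clean bit
        self.soft_read_only = True
```

In this model a power cut at any moment when disk.clean is True leaves
nothing pending, so no fsck would be needed on the next boot.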


There are actually some problems in the current FreeBSD sync code
that result in unnecessary writes to the disk, which make it hard
to implement this (the system basically sync's disk buffers that
don't need to be sync'ed, at intervals); that would have to be
fixed before such a system can be used.

The result is a box you can just "turn off", without trashing the
FS, assuming it's relatively quiescent, relative to FS writes
(e.g. desktop systems, as I said at the top).

Similarly, if the system were to panic, lose power, whatever, at
this point, then the FS's would be clean, and come back up with
no need to fsck.

-- Terry




Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert
Kris Kennaway wrote:
> On Mon, Nov 25, 2002 at 02:02:14PM -0800, Terry Lambert wrote:
> > I don't think this is really possible.
> 
> Yeah :(
> 
> > If you made system dumps mandatory (or marked swap with a non-dump
> > header in case of panic), this still would not handle the "silent
> > reboot", "double panic", or "single panic with disk I/O trashed"
> > cases.  8-(.
> 
> And the panics that affect the disk/filesystem are likely to not give
> a crashdump, but at the same time are likely to cause FS problems for
> bgfsck :-(

Actually, the worst problems come when the corruption does not
result in a crash subsequently.

If you just crashed again, you could simply set in the superblock
a flag that said "background fsck in progress", and if that flag
was set at boot time, then do a full fsck (knowing you died during
a background fsck).
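
As a rough sketch of that boot-time decision (Python for brevity; the
bgfsck_in_progress field is the hypothetical flag proposed here, not
an existing FFS superblock field, and the function names are made up):

```python
from dataclasses import dataclass


@dataclass
class Superblock:
    clean: bool = True
    bgfsck_in_progress: bool = False   # hypothetical flag, as proposed above


def choose_fsck(sb):
    """Return 'none', 'background', or 'full' for this boot."""
    if sb.bgfsck_in_progress:
        # We died while a background fsck was running: do a full one.
        return "full"
    if sb.clean:
        return "none"
    return "background"


def start_background_fsck(sb):
    # In a real system this would have to reach disk before checking starts.
    sb.bgfsck_in_progress = True


def finish_background_fsck(sb):
    # Only the flag is cleared; the mounted FS is still not marked clean.
    sb.bgfsck_in_progress = False
```

This only catches the case where the machine crashes again during the
background check; it does nothing for corruption that goes unnoticed.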

If you don't get a second crash, and you reboot, you're screwed.

You could add another utility to say "force full fsck" -- basically,
to set the flag manually.  This is a pain because you have to do it
through an fcntl() or ioctl(), since there are no block devices to
use to do the work, and you can't open a mounted device to write it:
even if you know what you are doing, the OS enforces this as if it
were smarter than you.

We ran into exactly this same problem in the InterJet, when we first
paid Kirk to have soft updates ported to FreeBSD (I actually did the
preliminary "make it compile" work, and Julian did most of the
debugging; I helped some after that, but my boss didn't like me
doing it).  The point was to get rid of the need for a UPS in the
InterJet.

A log structured FS doesn't actually have this problem, but is a
real pain because of the need for a "cleaner" to run constantly,
to garbage collect, which makes things that used to take
deterministic time take variable time.  Not very good for
multimedia or streaming content serving.

The InterJet handled this by having a DC holdup time following AC
failure notification, which was enough to throw a stick into the
spokes, to prevent the wheels from turning, and the bicycle falling
over the cliff.

Another way to handle it would be CMOS, with a BIOS initialization
(e.g. set bit 1 of the "crash state") that didn't affect the bits
that indicated the failure mode.
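
A tiny sketch of such a layout, in Python.  The byte layout is purely
an assumption for illustration: "bit 1" is taken as mask 0x02, and the
high nibble is arbitrarily chosen to hold the failure mode.

```python
BOOTED_BIT = 0x02          # "bit 1" of the crash state; position assumed
FAILURE_MODE_MASK = 0xF0   # hypothetical: high nibble holds the failure mode


def bios_init(crash_state):
    """Set the boot marker while leaving the failure-mode bits untouched."""
    return (crash_state | BOOTED_BIT) & 0xFF


def os_record_failure(crash_state, mode):
    """Store a 4-bit failure mode without disturbing the other bits."""
    cleared = crash_state & ~FAILURE_MODE_MASK & 0xFF
    return cleared | ((mode & 0x0F) << 4)


def failure_mode(crash_state):
    """Extract the 4-bit failure mode from the crash-state byte."""
    return (crash_state & FAILURE_MODE_MASK) >> 4
```

The point of the masking is that the BIOS and the OS can each update
their own bits of the same byte without clobbering the other's.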

Unfortunately, the computer manufacturers have not really agreed
on a standard for this sort of thing, nor do they think anyone in
OS space or userland should be able to own a section of CMOS
memory (no OS allocation policy, tagging, etc.).

-- Terry




Re: -current unusable after a crash

2002-11-25 Thread Brad Knowles
At 2:02 PM -0800 2002/11/25, Terry Lambert wrote:

>  If you made system dumps mandatory (or marked swap with a non-dump
>  header in case of panic), this still would not handle the "silent
>  reboot", "double panic", or "single panic with disk I/O trashed"
>  cases.  8-(.

	How about we do the safe thing, and only do background fsck if we
can prove that the system state is something where it would be
suitable?  Or would that mean that we almost never do background fsck?

--
Brad Knowles, <[EMAIL PROTECTED]>

"They that can give up essential liberty to obtain a little temporary
safety deserve neither liberty nor safety."
-Benjamin Franklin, Historical Review of Pennsylvania.

GCS/IT d+(-) s:+(++)>: a C++(+++)$ UMBSHI$ P+>++ L+ !E W+++(--) N+ !w---
O- M++ V PS++(+++) PE- Y+(++) PGP>+++ t+(+++) 5++(+++) X++(+++) R+(+++)
tv+(+++) b+() DI+() D+(++) G+() e++> h--- r---(+++)* z(+++)



Re: -current unusable after a crash

2002-11-25 Thread Kris Kennaway
On Mon, Nov 25, 2002 at 02:02:14PM -0800, Terry Lambert wrote:

> I don't think this is really possible.

Yeah :(

> If you made system dumps mandatory (or marked swap with a non-dump
> header in case of panic), this still would not handle the "silent
> reboot", "double panic", or "single panic with disk I/O trashed"
> cases.  8-(.

And the panics that affect the disk/filesystem are likely to not give
a crashdump, but at the same time are likely to cause FS problems for
bgfsck :-(

Kris




Re: -current unusable after a crash

2002-11-25 Thread Marcin Dalecki
Terry Lambert wrote:
> Kris Kennaway wrote:
> > On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote:
> > > > I thought, this might be due to the priority of the background fsck and
> > > > have once left it alone for several hours -- with no effect. The usual
> > > > fsck takes a few minutes.
> >
> > We really need to disable background fsck if the system panicked.
> > I've seen far too much bizarre filesystem behaviour that went away the
> > next time I did a full fsck.
>
> I don't think this is really possible.
>
> I went looking for a generic "application use" CMOS area for this
> sort of thing a while back, and I was unable to find one.

Well you should please take a look at the "fast boot" option
of moderately modern BIOS-es. Something along those lines went right now
in to the linux kernel. Seems pretty adequate to me, since you would
be even able to control it through the BIOS setup...





Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert
Mikhail Teterin wrote:
> On Monday 25 November 2002 12:24 pm, Kris Kennaway wrote:
> = On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote:
> =
> = > > I thought, this might be due to the priority of the background
> = > > fsck and have once left it alone for several hours -- with no
> = > > effect. The usual fsck takes a few minutes.
> =
> = We really need to disable background fsck if the system panicked.
> 
> Otherwise, is there a need for fsck at all? Can sudden powerloss be
> reliably distinguished from a panic?

No, nor from hardware failures (disk/controller/other), without
NVRAM to save the crash reason in the case there is one.

-- Terry




Re: -current unusable after a crash

2002-11-25 Thread Terry Lambert
Kris Kennaway wrote:
> On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote:
> > > I thought, this might be due to the priority of the background fsck and
> > > have once left it alone for several hours -- with no effect. The usual
> > > fsck takes a few minutes.
> 
> We really need to disable background fsck if the system panicked.
> I've seen far too much bizarre filesystem behaviour that went away the
> next time I did a full fsck.

I don't think this is really possible.

I went looking for a generic "application use" CMOS area for this
sort of thing a while back, and I was unable to find one.

If you made system dumps mandatory (or marked swap with a non-dump
header in case of panic), this still would not handle the "silent
reboot", "double panic", or "single panic with disk I/O trashed"
cases.  8-(.

There was a discussion about these issues when background fsck
first went in.  My opinion of having it on by default is that if
you are going to play that loose, you might as well mount the FSs
async, and be done with it.

-- Terry




Re: -current unusable after a crash

2002-11-25 Thread Archie Cobbs
Mikhail Teterin wrote:
> The only way to get my -current system back to normal after a crash is
> to boot into single user and do an explicit ``fsck -p''.
> 
> Otherwise the system will, seemingly, boot fine, but none of the ttyvs
> will accept any input, although tty-switching works fine. Remote
> connections (ssh, telnet) don't bring up the login prompt.

FYI, "me too". Manual fsck after booting single user mode fixed it.

-Archie

__
Archie Cobbs * Packet Design * http://www.packetdesign.com




Re: -current unusable after a crash

2002-11-25 Thread Mikhail Teterin
On Monday 25 November 2002 12:24 pm, Kris Kennaway wrote:
= On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote:
= 
= > > I thought, this might be due to the priority of the background
= > > fsck and have once left it alone for several hours -- with no
= > > effect. The usual fsck takes a few minutes.
=
= We really need to disable background fsck if the system panicked.

Otherwise, is there a need for fsck at all? Can sudden powerloss be
reliably distinguished from a panic?

= I've seen far too much bizarre filesystem behaviour that went away the
= next time I did a full fsck.

-mi





Re: -current unusable after a crash

2002-11-25 Thread Kris Kennaway
On Mon, Nov 25, 2002 at 10:24:46AM -0500, Robert Watson wrote:

> > I thought, this might be due to the priority of the background fsck and
> > have once left it alone for several hours -- with no effect. The usual
> > fsck takes a few minutes. 

We really need to disable background fsck if the system panicked.
I've seen far too much bizarre filesystem behaviour that went away the
next time I did a full fsck.

Kris





Re: -current unusable after a crash

2002-11-25 Thread Robert Watson

On Mon, 25 Nov 2002, Mikhail Teterin wrote:

> The only way to get my -current system back to normal after a crash is
> to boot into single user and do an explicit ``fsck -p''. 
> 
> Otherwise the system will, seemingly, boot fine, but none of the ttyvs
> will accept any input, although tty-switching works fine. Remote
> connections (ssh, telnet) don't bring up the login prompt. 
> 
> I thought, this might be due to the priority of the background fsck and
> have once left it alone for several hours -- with no effect. The usual
> fsck takes a few minutes. 
> 
> There are three drives in the system -- a 4G SCSI (on ahc0) with /,
> /usr, /opt, and /home on it, and two 30Gb IDEs coupled into one big ccd. 

Any chance we can get you to break into ddb on the console, do a ps, and
see what the processes are waiting for?  Also, if they're waiting on
something like "ufs" or "inode", generated ddb traces of the processes
would be interesting.

Robert N M Watson FreeBSD Core Team, TrustedBSD Projects
[EMAIL PROTECTED]  Network Associates Laboratories






-current unusable after a crash

2002-11-25 Thread Mikhail Teterin
The only way to get my -current system back to normal after a crash is
to boot into single user and do an explicit ``fsck -p''.

Otherwise the system will, seemingly, boot fine, but none of the ttyvs
will accept any input, although tty-switching works fine. Remote
connections (ssh, telnet) don't bring up the login prompt.

I thought, this might be due to the priority of the background fsck and
have once left it alone for several hours -- with no effect. The usual
fsck takes a few minutes.

There are three drives in the system -- a 4G SCSI (on ahc0) with /, /usr,
/opt, and /home on it, and two 30Gb IDEs coupled into one big ccd.

-mi

