Re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Scott I. Remick
On Wed, 26 Jan 2005 05:50:02 -0700 (MST), Terry R. Friedrichsen wrote:

 Is anybody besides *me* having file system corruption problems with FreeBSD
 5.3?  I've looked around on several of the mailing lists and found no
 mention of this.

Not the same problem as you, but I've been getting frequent ffs panics with
5.3 that I never got with 5.2.1. I didn't know the actual error at first
because I'm in X most of the time and they wouldn't appear there (system
would simply lock up). It wasn't until I started trying to update some
ports from console only that I caught the error. It only seems to happen
during periods of intense disk activity (writes?). 

I have the actual error written down at home. It always causes an fsck mess
upon starting up again, which makes me nervous. There are certain tasks I
simply cannot do anymore because they're write-intensive and I know they'll
trigger the panic.

___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]


Re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Nick Pavlica
I have been testing 5.3 (Standard Install/Default settings) and
haven't had any file system corruption.  However, the I/O performance
results from my testing currently show that there is a major
difference between 4.11 and 5.3 (4.11 is much faster!).  I have a
suspicion that these issues may be related to some core issues with
5.3 that need to be cleared up.


On Wed, 26 Jan 2005 05:50:02 -0700 (MST), Terry R. Friedrichsen
[EMAIL PROTECTED] wrote:
 
 Is anybody besides *me* having file system corruption problems with FreeBSD
 5.3?  I've looked around on several of the mailing lists and found no
 mention of this.
 
 I have two different platforms on which I'm trying to run FreeBSD 5.3.  One
 is an x86 SMP system (dual AMD Athlon 1900+) and the other is an Alpha DS-10.
 
 On the SMP system, doing anything I/O intensive (like a kernel build) quickly
 corrupts the file system - I start to encounter problems like being unable to
 remove entire directory trees because the system thinks that empty directories
 are not *really* empty and therefore cannot be deleted.  Other problems occur,
 too.
 
 On the Alpha system, I'm trying to get Xorg to work, with no success.  What
 normally happens is that the system locks up *totally* either when trying to
 configure X or when running the X server after configure generates a config
 file (I'm trying multiple versions of Xorg).
 
 The lockup means that I have to power-cycle the system to reboot.  When I do
 this, the filesystem is *always* horribly damaged.  I finally gave up when I
 couldn't even get into sh in single-abuser mode because /libexec/ld.so.1
 was no longer there ...
 
 What I'm going to try next is pulling one CPU out of the SMP system to see if
 that helps.  On the Alpha, I'm just going to give up on Xorg for a while.
 
 I'd hate to have to drop back to 4.10 or 4.11 ...
 
 If anyone has any suggestions, or even just sympathetic words, I'd be happy
 to hear them!
 
 Thanks.
 
 Terry R. Friedrichsen
 
 [EMAIL PROTECTED]


re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Terry R. Friedrichsen

Thanks for responding to my inquiry.  If it fits into your testing program,
try running something that works the file system and simply turn off the
system power in the middle of it.

Twice, now, doing this on my Alpha has rendered the system unrecoverable at
boot time, necessitating a reinstall.

Terry

[EMAIL PROTECTED]


Re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Nick Pavlica
That same thought ran through my mind when I was testing.  I started a
process that does heavy writing and literally pulled the plug during
the middle of the operation.  I plugged it back in and the box came
back up without a hitch.  I did all my testing on x86 boxes using SCSI
and IDE drives. I currently don't have access to any Alpha boxes to
test on them.  I'm not a big fan of Alpha, but the DS10 has always
been a great workhorse in my experience.  Is the firmware etc up to
date on that box?

--Nick


On Wed, 26 Jan 2005 12:42:47 -0700 (MST), Terry R. Friedrichsen
[EMAIL PROTECTED] wrote:
 
 Thanks for responding to my inquiry.  If it fits into your testing program,
 try running something that works the file system and simply turn off the
 system power in the middle of it.
 
 Twice, now, doing this on my Alpha has rendered the system unrecoverable at
 boot time, necessitating a reinstall.
 
 Terry
 
 [EMAIL PROTECTED]



Re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Kris Kennaway
On Wed, Jan 26, 2005 at 05:50:02AM -0700, Terry R. Friedrichsen wrote:
 
 Is anybody besides *me* having file system corruption problems with FreeBSD
 5.3?  I've looked around on several of the mailing lists and found no
 mention of this.
 
 I have two different platforms on which I'm trying to run FreeBSD 5.3.  One
 is an x86 SMP system (dual AMD Athlon 1900+) and the other is an Alpha DS-10.
 
 On the SMP system, doing anything I/O intensive (like a kernel build) quickly
 corrupts the file system - I start to encounter problems like being unable to
 remove entire directory trees because the system thinks that empty directories
 are not *really* empty and therefore cannot be deleted.  Other problems occur,
 too.

Drop to single-user mode and run fsck -fy.  Sometimes fsck will fail
to detect disk corruption at boot time and it will cause problems
later on.
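
As a rough sketch of what that looks like across every UFS file system: the
helper below only prints the forced-check commands so they can be reviewed and
then typed by hand at the single-user prompt. The fstab parsing and the
fsck_commands name are illustrative assumptions, not something from this
thread.

```shell
#!/bin/sh
# Sketch: list a forced, answer-yes fsck command for each UFS entry in an
# fstab-style file. Nothing is executed; the commands are only printed.
# fstab fields: device mountpoint fstype options dump passno
fsck_commands() {
    fstab="${1:-/etc/fstab}"
    # Skip comment lines, non-UFS entries, and entries with pass number 0
    # (those are not checked at boot either).
    awk '$1 !~ /^#/ && $3 == "ufs" && $6 > 0 { print "fsck -fy " $1 }' "$fstab"
}
```

Because awk lives under /usr, which may not be mounted in single-user mode,
it is easiest to run fsck_commands while still multi-user and note the output.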

 On the Alpha system, I'm trying to get Xorg to work, with no success.

It's quite possible no-one else has tested this.  alpha is no longer a
tier-1 architecture because of lack of developer interest.

Kris



Re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Kris Kennaway
On Wed, Jan 26, 2005 at 12:42:47PM -0700, Terry R. Friedrichsen wrote:
 
 Thanks for responding to my inquiry.  If it fits into your testing program,
 try running something that works the file system and simply turn off the
 system power in the middle of it.

This is expected if you don't turn off write caching of the hard
disks.  Write caching breaks the softupdates consistency model because
data reported as written to the disk may not actually have reached the
platters, so it's not there following an unexpected power cycle.
Unfortunately turning off write caching causes a performance hit, and
there was a large user backlash when it was briefly turned off by
default some years ago.
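
For ATA disks, a minimal sketch of turning write caching off on a 5.x
system, assuming the ata(4) driver's hw.ata.wc loader tunable (SCSI disks
are configured differently and are not covered here).  The line goes in
/boot/loader.conf and takes effect at the next boot:

```sh
# /boot/loader.conf
# 0 = disable ATA write caching (safer for softupdates, slower writes)
hw.ata.wc="0"
```

After rebooting, "sysctl hw.ata.wc" should report the value in effect.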

Kris



re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Terry R. Friedrichsen
I wrote:

 try running something that works the file system and simply turn off the
 system power in the middle of it.

to which [EMAIL PROTECTED] replied:

 This is expected if you don't turn off write caching of the hard
 disks.  Write caching breaks the softupdates consistency model because
 data reported as written to the disk may not actually have reached the
 platters, so it's not there following an unexpected power cycle.
 Unfortunately turning off write caching causes a performance hit, and
 there was a large user backlash when it was briefly turned off by
 default some years ago.

What you say is true, but what I'm observing is far worse than simply
missing the last few blocks of output files, etc.

The last time I had to power-cycle the Alpha box (because Xorg hung it),
it rebooted to single-user mode but I couldn't even run sh because
some file in lib was missing.  Or if I *do* get into sh to run fsck,
it finds *hundreds and hundreds* of problems ...

And all of this is on a freshly-installed, bog-standard 5.3 system.

Anyway, I'm going to stop messing about with Xorg on that machine, which
will doubtless make the problem invisible.

And yeah, I know I'm gonna have to stop upgrading my Alpha machines some
day, but I was hoping to get 5.something with an X system running as an
end-of-life position.

The i386 SMP box, though, is another story.  I am going to have to nail
down its problems if I intend to track FreeBSD on it.  If I could
successfully build a kernel on it, I'd turn off SMP and see how it behaves.

But it appears that I am the only one suffering ...

Thanks for all the responses.

Terry

[EMAIL PROTECTED]


Re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Kris Kennaway
On Wed, Jan 26, 2005 at 03:53:46PM -0700, Terry R. Friedrichsen wrote:
 I wrote:
 
  try running something that works the file system and simply turn off the
  system power in the middle of it.
 
 to which [EMAIL PROTECTED] replied:
 
  This is expected if you don't turn off write caching of the hard
  disks.  Write caching breaks the softupdates consistency model because
  data reported as written to the disk may not actually have reached the
  platters, so it's not there following an unexpected power cycle.
  Unfortunately turning off write caching causes a performance hit, and
  there was a large user backlash when it was briefly turned off by
  default some years ago.
 
 What you say is true, but what I'm observing is far worse than simply
 missing the last few blocks of output files, etc.
 
 The last time I had to power-cycle the Alpha box (because Xorg hung it),
 it rebooted to single-user mode but I couldn't even run sh because
 some file in lib was missing.  Or if I *do* get into sh to run fsck,
 it finds *hundreds and hundreds* of problems ...

Some disks are also known to go crazy and scribble everywhere when
they lose power.

Kris



Re: FreeBSD 5.3 file system troubles

2005-01-26 Thread Robert Watson

On Wed, 26 Jan 2005, Terry R. Friedrichsen wrote:

 Is anybody besides *me* having file system corruption problems with
 FreeBSD 5.3?  I've looked around on several of the mailing lists and
 found no mention of this.
 
 I have two different platforms on which I'm trying to run FreeBSD 5.3. 
 One is an x86 SMP system (dual AMD Athlon 1900+) and the other is an
 Alpha DS-10. 

Could you try setting the following setting in /etc/rc.conf:

background_fsck=NO

Soft updates is supposed to trickle meta-data changes to the disk 'in
order' so that background fsck can make strong assumptions about the
consistency of data even after a crash.  If these assumptions are being
violated -- hardware issues, a bug in the storage driver, gamma radiation
from on high, file system bugs, etc. -- cascading corruption may be possible.
While there were many reports of this early in bgfsck development, almost
all reports have gone away, with most of the remaining problems being put
down to hardware failure.  However, it could be you've run into one.
Switching to always using foreground fsck should increase the reliability
of the scanning process, and result in an early stop if there's
unrecoverable corruption that fsck can recognize (it's more rigorous and
can handle more failure modes because plain fsck operates under weaker
assumptions).
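
To confirm which value the boot scripts will actually see, here is a small
sketch.  The bgfsck_setting helper is hypothetical; it assumes the stock
default of YES from /etc/defaults/rc.conf and that the file is plain sh
syntax, as rc.conf is.

```shell
#!/bin/sh
# Sketch: print the effective background_fsck value from an rc.conf-style
# file. The file is sourced in a subshell so nothing leaks into the caller.
bgfsck_setting() {
    conf="${1:-/etc/rc.conf}"
    (
        background_fsck="YES"          # assumed system default when unset
        [ -r "$conf" ] && . "$conf"    # rc.conf is ordinary sh assignments
        echo "$background_fsck"
    )
}
```

Running bgfsck_setting with no argument reads /etc/rc.conf; it should print
NO once the setting above is in place.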

Since this seems to be a reproducible problem, the next step if we can
isolate it a bit (and get it caught before catastrophic failure), is to
generate some log information about the nature of the corruption as
reported by fsck.  Typically this is done by reproducing the corruption,
booting to single user mode, and then logging fsck -y output to a memory
disk, booting multi-user, and e-mailing the fsck output to Kirk. :-)  So
try switching to foreground fsck (which will slow the boot process), and
let's see if this prevents nastier corruption.
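
The logging step could be sketched roughly as follows.  The log_fsck helper,
the /mnt mount point, the 8m size, and the device name in the usage comment
are all placeholders; mdmfs(8) is assumed to be available, as it is on 5.x.
Only shell built-ins and tools from /sbin and /bin are used, since /usr may
not be mounted in single-user mode.

```shell
#!/bin/sh
# Sketch: run "fsck -y" on one device and keep a copy of its output in a
# log directory (intended to be a memory disk mounted beforehand).
log_fsck() {
    dev="$1"
    logdir="${2:-/mnt}"
    log="$logdir/fsck-${dev##*/}.log"   # e.g. /mnt/fsck-ad0s1e.log
    status=0
    fsck -y "$dev" > "$log" 2>&1 || status=$?
    cat "$log"                          # also show the output on the console
    return "$status"
}

# Usage (as root, in single-user mode):
#   mdmfs -s 8m md /mnt          # create and mount a small memory disk
#   log_fsck /dev/ad0s1e /mnt    # hypothetical device name
```

The memory disk survives only until the next reboot, so the log has to be
copied off or mailed after continuing to multi-user by exiting the
single-user shell, not after restarting the machine.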

Begin the process by doing a full manual fsck of all file systems from
single-user mode to make sure we start out in a known good state.  Don't
use -p: running without it forces fsck to really look rather than assume
the clean flag is right.

Thanks,

Robert N M Watson


 