Bug#328740: linux-source-2.6.12: xfs filesystem corruption

2005-10-04 Thread Jean-Luc Coulon (f5ibh)

Christoph,

Maybe I've found an event that triggers the fault.

If I run xfs_fsr (the file system optimizer for XFS) and then
xfs_check on several filesystems, xfs_check has several times
reported an error on this filesystem. The message is:


bad nblocks 2 for inode 1026325, counted 3

If I run xfs_repair, after the informational messages, I get:

...
Phase 3
...

correcting nblocks for inode 1026325
was 2 - counted 3
...

I'm not sure it is always the same inode (probably not, but I'm not sure
I remember correctly; the system is running on a software RAID1), but
the error always shows up just before a filesystem crash if I don't run
xfs_repair.
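
A minimal sketch for recording which inode is reported on each run, so
that runs can be compared. It assumes xfs_check keeps printing the
message in the form quoted above; the function name, the log file and
the volume path in the usage comment are made up for illustration.

```shell
#!/bin/sh
# Print the inode number from every "bad nblocks" line read on stdin.
# The expected line format is the xfs_check message quoted above:
#   bad nblocks 2 for inode 1026325, counted 3
# Lines in any other format are ignored.
bad_nblocks_inodes() {
    sed -n 's/^bad nblocks [0-9][0-9]* for inode \([0-9][0-9]*\), counted [0-9][0-9]*$/\1/p'
}

# Hypothetical usage (the device path is a placeholder, not from this report):
#   xfs_check /dev/mapper/vg-home 2>&1 | bad_nblocks_inodes >> nblocks-inodes.log
```

Running this after each xfs_fsr pass and comparing the logged numbers
would show whether the same inode is hit every time.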


If I do run xfs_repair, the crash never happens.

I don't know whether the problem should stay assigned to the filesystem
or be reassigned to xfsprogs.



Regards

Jean-Luc






2005-09-17 Thread Jean-Luc Coulon (f5ibh)
Package: linux-source-2.6.12
Version: 2.6.12-6
Severity: normal

Hi,

I'm not sure the kernel is responsible for this problem; it could be LVM as
well.

I run 2.6.12 with LVM on top of a software RAID1.
All the filesystems are XFS.
The architecture is x86_64 on an Athlon 64 3500+. The system was installed
using this architecture.

While logging out of GNOME, I noticed that the home filesystem was no longer
present. I found the following messages in the syslog:

Sep 16 18:28:12 tangerine kernel: Filesystem dm-6: xfs_iflush: detected corrupt incore inode 1026326, total extents = 1, nblocks = 0, ptr 0x81002367f600
Sep 16 18:28:12 tangerine kernel: xfs_force_shutdown(dm-6,0x8) called from line 3311 of file fs/xfs/xfs_inode.c.  Return address = 0x88118d08
Sep 16 18:28:12 tangerine kernel: Filesystem dm-6: Corruption of in-memory data detected.  Shutting down filesystem: dm-6
Sep 16 18:28:12 tangerine kernel: Please umount the filesystem, and rectify the problem(s)

The raid was still running and clean.

I tried xfs_check on the logical volume, and it told me that there was
valuable information in the journal and that I should run xfs_repair -L to
drop the journal. I did so, because the filesystem could not be mounted.

I got an inode in the lost+found directory; everything else was fine except
the balsa settings directory, .balsa.

The system did not crash, but a couple of hours earlier a compilation had
failed because I ran out of space on another logical volume. I then added
some space to that logical volume on the fly and grew the matching
filesystem.

I ran memtest86 for 10 hours to be sure (if it is possible to be sure) of the
memory integrity.

The fact that I filled up one filesystem might have triggered the problem: I
have already had this kind of problem in the past.

Regards

Jean-Luc

-- System Information:
Debian Release: testing/unstable
  APT prefers unstable
  APT policy: (900, 'unstable')
Architecture: amd64 (x86_64)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.12-k8-9
Locale: [EMAIL PROTECTED], [EMAIL PROTECTED] (charmap=ISO-8859-15)

Versions of packages linux-source-2.6.12 depends on:
ii  binutils              2.16.1cvs20050902-1  The GNU assembler, linker and bina
ii  bzip2                 1.0.2-8.1            high-quality block-sorting file co
ii  coreutils [fileutils  5.2.1-2.1            The GNU core utilities

Versions of packages linux-source-2.6.12 recommends:
ii  gcc                   4:4.0.1-3            The GNU C compiler
ii  libc6-dev [libc-dev]  2.3.5-6              GNU C Library: Development Librari
ii  make                  3.80-11              The GNU version of the make util

-- no debconf information







2005-09-17 Thread Christoph Hellwig
This looks like a typical corruption caused by not turning off the
write cache on your disks.  Is the write cache on your disk on or
off?







2005-09-17 Thread Christoph Hellwig
On Sat, Sep 17, 2005 at 09:33:06AM +, Jean-Luc Coulon (f5ibh) wrote:
> On 17.09.2005 11:18:02, Christoph Hellwig wrote:
> > This looks like a typical corruption caused by not turning off the
> > write cache on your disks.  Is the write cache on your disk on or
> > off?
>
> The disks are SATA disks (Maxtor DiamondMax 9, 80GB) and I've not found
> any way to turn it on/off via hdparm.

Because SATA is handled by the scsi layer.  What does

sdparm -a your disk device | grep WCE

say?







2005-09-17 Thread Jean-Luc Coulon (f5ibh)

Christoph,

On 17.09.2005 17:14:28, Christoph Hellwig wrote:

> On Sat, Sep 17, 2005 at 09:33:06AM +, Jean-Luc Coulon (f5ibh) wrote:
> > On 17.09.2005 11:18:02, Christoph Hellwig wrote:
> > > This looks like a typical corruption caused by not turning off the
> > > write cache on your disks.  Is the write cache on your disk on or
> > > off?
> >
> > The disks are SATA disks (Maxtor DiamondMax 9, 80GB) and I've not
> > found any way to turn it on/off via hdparm.
>
> Because SATA is handled by the scsi layer.  What does
>
> sdparm -a your disk device | grep WCE
>
> say?


[EMAIL PROTECTED] % sdparm /dev/sda | grep WCE
  WCE 1  [ sav:  1]

The result is the same on both disks of the RAID.
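
A minimal sketch for checking each RAID member in one go. It assumes the
sdparm output keeps the layout shown above, with the field name WCE
followed by its current value; the function name is made up for
illustration.

```shell
#!/bin/sh
# Read sdparm output on stdin and report the write cache state.
# Assumes the layout shown above: the field name "WCE" first, then the
# current value (0 or 1) as the second whitespace-separated field.
wce_state() {
    awk '$1 == "WCE" { print ($2 == 1 ? "on" : "off"); found = 1 }
         END { if (!found) print "unknown" }'
}

# Usage with the device named in this message:
#   sdparm /dev/sda | wce_state
#
# Not from this thread: the sdparm manual documents --clear=WCE as the way
# to set the bit to zero, which should switch the write cache off. Treat
# this as an assumption to verify against the local manual page first.
#   sdparm --clear=WCE /dev/sda
```

With the output quoted above, the function reports "on", meaning the
write cache is enabled.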


Regards

Jean-Luc

