Bug#310243: kernel-image-2.6.8-powerpc: xfs data corruption

2005-06-04 Thread Luis M
On 6/4/05, Joerg Rossdeutscher [EMAIL PROTECTED] wrote:
 Hi,
 
 I found your bug report in the debian bugtracker:
 #310243 kernel-image-2.6.8-powerpc: xfs data corruption
 
 I have the same problem with this kernel version on an amd64 ubuntu.
 Have you found a way to fix it?
 
 My machine is an IMAP/SMTP/POP3 server and crashes only about once a
 month, so it's very hard to debug.
 
 TIA,
 
 bye, Ratti

Unfortunately, I have not found a solution to this problem. I'm not
sure whether the cause is a bad drive (which is possible), a limitation
of the Mac's BIOS (after 19GB of data it gets corrupted), or something
else.
I believe I have ruled out the possibility that XFS had anything to do
with it, because I have reformatted the drive with various other
filesystems (I even wrote zeros to the whole drive using dd and
/dev/zero, and then reformatted the drive using badblocks and ext3).
No matter what I did, the drive always became corrupted again after a
certain number of gigabytes had been written. (I was copying files with
UTF-8 encoded names from another Macintosh running Mac OS X; I'm not
sure whether that is related.)
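
One way to check whether a drive corrupts data after a certain amount has been written, independent of the filesystem, is to write a known pattern and read it back. The following is only a minimal sketch under stated assumptions: the function names, the 1 MiB block size, and the target path are hypothetical and do not come from this thread. A read-back done right after writing may be served from the page cache, so the filesystem would need to be remounted (or the machine rebooted) between writing and verifying for the result to say anything about the disk.

```python
import hashlib
import os

BLOCK_SIZE = 1024 * 1024  # 1 MiB per block (arbitrary choice for this sketch)


def block_pattern(index, size=BLOCK_SIZE):
    """Return deterministic contents for block number `index`.

    Each block gets different bytes, so a block that lands at the wrong
    offset is detected as well as one that is simply damaged.
    """
    seed = hashlib.sha256(index.to_bytes(8, "big")).digest()  # 32 bytes
    reps = size // len(seed) + 1
    return (seed * reps)[:size]


def write_pattern(path, num_blocks, size=BLOCK_SIZE):
    """Write `num_blocks` pattern blocks to `path` and flush them to disk."""
    with open(path, "wb") as f:
        for i in range(num_blocks):
            f.write(block_pattern(i, size))
        f.flush()
        os.fsync(f.fileno())


def first_bad_block(path, num_blocks, size=BLOCK_SIZE):
    """Return the index of the first block that reads back wrong, or None."""
    with open(path, "rb") as f:
        for i in range(num_blocks):
            if f.read(size) != block_pattern(i, size):
                return i
    return None
```

With 1 MiB blocks, an index returned by `first_bad_block` is also the offset in MiB at which the first mismatch was seen, which can be compared against the 19GB figure mentioned above.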

I have since removed the drive from the computer. I'll put it in an
x86 system and see if it works. That way I can tell whether the drive
itself has problems or the BIOS of the PowerPC system has limitations.
Then again, that drive is 40GB and there is already another 40GB drive
in the same computer (this one was set as a slave on the same IDE bus).

Feel free to lower this bug report's severity to low or less until I
can find out what the problem is.


-- 
)(- 
Luis M
System Administrator
Kiskeyix.org 




Bug#310243: kernel-image-2.6.8-powerpc: xfs data corruption

2005-05-22 Thread Luis Mondesi
Package: kernel-image-2.6.8-powerpc
Version: 2.6.8-12
Severity: normal

When using rsync to copy data from a Macintosh running Mac OS X to one
running Debian Sarge on an XFS drive, the system reports XFS data
corruption. The following is all the information I could collect from
the various log files, dmesg, and stdout. This happens every time.
When copying data from another Linux system, nothing goes wrong and the
data gets copied correctly. When copying the data from the Mac to a
Linux system running a 2.6.12 kernel on x86, the data also gets copied
correctly.
Things to keep in mind: the Mac OS X file names are encoded in UTF-8
and contain non-ASCII characters (such as ñ or accented vowels).
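
As background for the note on file names: Mac OS X's HFS+ stores names in a decomposed Unicode form, so a character such as ñ arrives as "n" followed by a combining tilde, and the same visible name takes more bytes than its precomposed form. The sketch below only illustrates that difference; the helper name and the sample file name are hypothetical, and nothing here shows that the encoding is related to the corruption.

```python
import unicodedata


def describe_name(name):
    """Summarize how a file name looks in composed and decomposed form."""
    nfc = unicodedata.normalize("NFC", name)  # precomposed, e.g. U+00F1
    nfd = unicodedata.normalize("NFD", name)  # decomposed, e.g. "n" + U+0303
    return {
        "is_ascii": all(ord(ch) < 128 for ch in name),
        "nfc_bytes": len(nfc.encode("utf-8")),
        "nfd_bytes": len(nfd.encode("utf-8")),
        "forms_differ": nfc != nfd,
    }


# Hypothetical sample name containing a precomposed n-with-tilde (U+00F1).
sample = describe_name("Espa\u00f1a.mp3")
```

Running `describe_name` over the names in the source tree would show which of them change length between the two forms.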

dump
Filesystem hdd2: XFS internal error xfs_iformat(7) at line 552 of file
fs/xfs/xfs_inode.c.  Caller 0xe2435dd8
Call trace:
[c000ba5c] dump_stack+0x18/0x28
[e242d788] xfs_error_report+0x60/0x64 [xfs]
[e2434a10] xfs_iformat+0xb4/0x444 [xfs]
[e2435dd8] xfs_iread+0x15c/0x1b8 [xfs]
[e2433488] xfs_iget_core+0xbc/0x4f0 [xfs]
[e2433a04] xfs_iget+0x148/0x180 [xfs]
[e244bd24] xfs_trans_iget+0xc4/0x154 [xfs]
[e2435fe8] xfs_ialloc+0x9c/0x414 [xfs]
[e244c810] xfs_dir_ialloc+0x70/0x280 [xfs]
[e2452a08] xfs_mkdir+0x258/0x650 [xfs]
[e245d7a4] linvfs_mknod+0x2c4/0x370 [xfs]
[c006fd4c] vfs_mkdir+0x8c/0xd0
[c006fe3c] sys_mkdir+0xac/0x100
[c0007d30] ret_from_syscall+0x0/0x4c
xfs_force_shutdown(hdd2,0x8) called from line 1088 of file
fs/xfs/xfs_trans.c.  Return address = 0xe2460d54
Filesystem hdd2: Corruption of in-memory data detected.
Shutting down filesystem: hdd2
Please umount the filesystem, and rectify the problem(s)
end dump

When I umount the filesystem, I have to run xfs_repair -L to get it back,
and some files are missing... This is a serious problem.
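
To find out which files went missing or changed after a repair, the copy can be compared against the source by checksum. This is a rough sketch, assuming both trees are reachable as local paths; the function names are hypothetical. Relative paths are compared byte for byte, so names that differ only in Unicode normalization would be reported as missing.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def tree_manifest(root):
    """Map each file's path relative to `root` to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def compare_manifests(source, copy):
    """Return (missing, changed): paths absent from or different in `copy`."""
    missing = sorted(set(source) - set(copy))
    changed = sorted(k for k in source if k in copy and source[k] != copy[k])
    return missing, changed
```

Calling `compare_manifests(tree_manifest(src), tree_manifest(dst))` returns two sorted lists of relative paths.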

This is what I was doing when the problem was triggered:

dump
rsync -e ssh -Pauvz server:/path/to/data .
...
Fania All-Stars/Unknown Album/
Fania Allstars  Ismael Mirand/
Fania Allstars  Ismael Mirand/Unknown Album/
rsync: write failed on
/home/Shared2/Music/01-dmx-we_right_here_(dirty)-e/Unknown
Album/dmx_-_the_great_depression_-_05_-_we_right_here.mp3: Input/output
error (5)
rsync: failed to set permissions on
/home/Shared2/Music/01-dmx-we_right_here_(dirty)-e/Unknown
Album/.dmx_-_the_great_depression_-_05_-_we_right_here.mp3.e4qX2G:
Input/output error (5)
rsync: rename
/home/Shared2/Music/01-dmx-we_right_here_(dirty)-e/Unknown
Album/.dmx_-_the_great_depression_-_05_-_we_right_here.mp3.e4qX2G -
01-dmx-we_right_here_(dirty)-e/Unknown
Album/dmx_-_the_great_depression_-_05_-_we_right_here.mp3: Input/output
error (5)
enddump

This is the output for xfs_repair -L

dump
$ sudo xfs_repair -L /dev/hdd2
Phase 1 - find and verify superblock...
Phase 2 - using internal log
- zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
- scan filesystem freespace and inode maps...
- found root inode chunk
Phase 3 - for each AG...
- scan and clear agi unlinked lists...
- process known inodes and perform inode discovery...
- agno = 0
- agno = 1
- agno = 2
bad inode format in inode 33554604
bad inode format in inode 33554604
cleared inode 33554604
- agno = 3
data fork in regular inode 59788417 claims used block 4638076
bad data fork in inode 59788417
cleared inode 59788417
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
- process newly discovered inodes...
Phase 4 - check for duplicate blocks...
- setting up duplicate extent list...
- clear lost+found (if it exists) ...
- check for inodes claiming duplicate blocks...
- agno = 0
- agno = 1
- agno = 2
- agno = 3
entry the_village.vob in shortform directory 59788416 references free inode 59788417
junking entry the_village.vob in directory inode 59788416
- agno = 4
- agno = 5
- agno = 6
- agno = 7
- agno = 8
- agno = 9
- agno = 10
- agno = 11
- agno = 12
- agno = 13
- agno = 14
- agno = 15
Phase 5 - rebuild AG headers and trees...
- reset superblock...
Phase 6 - check inode connectivity...
- resetting contents of realtime bitmap and summary inodes
- ensuring existence of lost+found directory
- traversing filesystem starting at / ...
- traversal finished ...
- traversing all unattached subtrees ...
- traversals finished ...
- moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

enddump
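
The repair log above names the inodes that were cleared and the directory entry that was junked, which is what explains the missing files. A small sketch for pulling those out of a saved log follows. It assumes the message wording seen in this report ("cleared inode N", "junking entry NAME in directory inode N"); the function name is hypothetical, and other xfs_repair versions may word things differently.

```python
import re

CLEARED_RE = re.compile(r"cleared inode (\d+)")
# The entry name may or may not be wrapped in double quotes.
JUNKED_RE = re.compile(r'junking entry "?(.+?)"? in directory inode (\d+)')


def summarize_repair(output):
    """List cleared inodes and junked directory entries from a repair log."""
    # Collapse all whitespace so messages wrapped across lines still match.
    text = " ".join(output.split())
    cleared = sorted({int(n) for n in CLEARED_RE.findall(text)})
    junked = [(name, int(ino)) for name, ino in JUNKED_RE.findall(text)]
    return {"cleared_inodes": cleared, "junked_entries": junked}
```

Each junked entry is returned as a (name, directory inode) pair, so the lost file names can be looked up in the source tree and copied again.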

I'll dump the files to a different disk and reformat to ext3 and retest.

This could be a hardware issue, though after googling and finding many
kernel fixes between versions 2.6.8 and 2.6.12, I believe I have hit a
known bug.

-- System Information:
Debian Release: 3.1
APT prefers testing
APT policy: (500, 'testing')
Architecture: powerpc (ppc)
Kernel: Linux 2.6.8-powerpc
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8) (ignored: LC_ALL set to en_US.UTF-8)

Versions of packages kernel-image-2.6.8-powerpc depends on:
ii