Having had some time and feeling patient, I have recently spent
a few days on Linux IO subsystem and filesystem analysis and
testing, as summarized (with lots of numbers and references)
here:

  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050906
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050907
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050908
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050909
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050910
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050911
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050912
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050913
  http://WWW.sabi.co.UK/Notes/swhwAnno05.html#050914

and as a result of those tests I have decided to switch my
filesystems to JFS, to see how much performance degrades over
time as files get deleted/added/rewritten.

The good news is that JFS seemed to me one of the two (with
'ext3' with 1KiB blocks) most desirable choices, and it has
turned out to have some unexpected boons too.

The bad news is that I have already suffered from several
crashes and one bizarre performance problem... My setup
consists of an Athlon XP 2000+, 512MB, 2x80GB and 2x160GB hard
discs, running a mainline 2.6.13 kernel, with 1.1.8 'jfsprogs'.

The incidents so far:

* Some of my tests were tree traversals, which generate a flood of
  inode updates because of access times, and these hit the journal
  hard. So I wondered what the timings would be with '-o noatime';
  unfortunately I got a crash because of that.

* I converted from 'ext3' to JFS file systems by copying things
  around, and I got a couple of lockups. These may have been
  related to high buffer cache traffic (I was doing a large 'dd'
  between partitions at the time) and races arising from it.

* When restoring a '.tar.bz2' held on a 'vfat' file system to a
  newly formatted 'jfs' one I got a dtree corruption, with no
  device errors. I 'fsck'ed it to fix that and redid the restore
  and it did not happen again. There was again a 'dd' between
  two partitions running at the same time.

* Making a file system with a 30MiB log instead of the default
  32MiB makes reading it with 'tar' over twice as slow. This is
  for the same partition on the same hard disc with the same
  content freshly loaded (it was so strange I checked several
  times).

All of which leads me to think that not many people have used
non-default log sizes, or used JFS with FAT32 or massive 'dd'ing,
or with 'noatime'... :-)
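
For anyone who wants to repeat the read timings (default log size
versus 30MiB, or with and without 'noatime'), here is a minimal
Python sketch of a tree read test. It is not the 'tar' run I used,
only a rough stand-in: the function name 'time_tree_read' and the
1MiB chunk size are my own assumptions, and the file system should
be unmounted and remounted between runs so that the page cache is
cold.

```python
import os
import time


def time_tree_read(root, chunk_size=1 << 20):
    """Walk 'root', read every regular file, return (files, bytes, seconds)."""
    files = 0
    total = 0
    start = time.perf_counter()
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks, device nodes, FIFOs and the like.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as f:
                    while True:
                        chunk = f.read(chunk_size)
                        if not chunk:
                            break
                        total += len(chunk)
            except OSError:
                # Unreadable file: leave it out of the totals.
                continue
            files += 1
    return files, total, time.perf_counter() - start
```

Calling 'time_tree_read' on the mount point of each test file
system and comparing the elapsed seconds should show whether the
slowdown reproduces elsewhere.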

Some more context and some data... I was in multiuser but not
GUI mode when the incidents above happened, with only a few
dæmons running.

The output of 'jfs_fsck' after the «DT_GETPAGE: dtree page
corrupt» errors:

----------------------------------------------------------------
jfs_fsck version 1.1.8, 03-May-2005
processing started: 9/14/2005 18.52.55
Using default parameter: -p
The current device is:  /dev/hdb11
Block size in bytes:  4096
Filesystem size in blocks:  1028152
**Phase 0 - Replay Journal Log
**Phase 1 - Check Blocks, Files/Directories, and  Directory Entries
**Phase 2 - Count links
Incorrect link counts have been detected. Will correct.
**Phase 3 - Duplicate Block Rescan and Directory Connectedness
**Phase 4 - Report Problems
File system object DF20499 is linked as: /var
cannot repair the data format error(s) in this directory.
cannot repair DF20499.  Will release.
File system object DF20512 is linked as: /dev/ida
cannot repair DF20512.  Will release.
**Phase 5 - Check Connectivity
**Phase 6 - Perform Approved Corrections
768 files reconnected to /lost+found/.
**Phase 7 - Rebuild File/Directory Allocation Maps
**Phase 8 - Rebuild Disk Allocation Maps
  4112608 kilobytes total disk space.
    35465 kilobytes in 13747 directories.
  2756001 kilobytes in 135907 user files.
        0 kilobytes in extended attributes
   100356 kilobytes reserved for system use.
  1291716 kilobytes are available for use.
Filesystem is clean.
----------------------------------------------------------------

The one 'oops' that got logged (it happened twice):

----------------------------------------------------------------
Unable to handle kernel paging request at virtual address cc05b9a4
 printing eip:
c0251f5d
*pde = 00030067
*pte = 0c05b000
Oops: 0000 [#1]
DEBUG_PAGEALLOC
Modules linked in: binfmt_misc snd_cmipci snd_opl3_lib snd_hwdep snd_seq_oss 
snd_seq_midi snd_seq_midi_event snd_seq snd_via82xx gameport snd_ac97_codec 
snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_mpu401_uart 
snd_rawmidi snd_seq_device snd soundcore 3c59x mii parport_pc lp parport video 
thermal processor fan container button battery ac it87 eeprom i2c_sensor 
i2c_isa i2c_dev i2c_core ntfs nls_iso8859_1 nls_cp437 sg sr_mod ide_scsi 
scsi_mod 8250 serial_core nvram rtc
CPU:    0
EIP:    0060:[txUpdateMap+333/656]    Not tainted VLI
EFLAGS: 00010246   (2.6.13p) 
EIP is at txUpdateMap+0x14d/0x290
eax: cc05b97c   ebx: e0996990   ecx: e08366c8   edx: 00000900
esi: 00000001   edi: e0996980   ebp: dfdc7f48   esp: dfdc7f10
ds: 007b   es: 007b   ss: 0068
Process jfsCommit (pid: 139, threadinfo=dfdc7000 task=c15725d0)
Stack: e084be30 0000060c dfdc7f48 c024f181 00000000 00000040 d94596fc dbefc2fc 
       00000202 00000000 00000000 dc64d160 e08366c8 e08366c8 dfdc7f74 c02529b2 
       e08366c8 00000286 e0861514 dfdc7fe4 00000000 0000007b 0000007b dc64d160 
Call Trace:
 [show_stack+127/160] show_stack+0x7f/0xa0
 [show_registers+343/448] show_registers+0x157/0x1c0
 [die+332/688] die+0x14c/0x2b0
 [do_page_fault+921/1791] do_page_fault+0x399/0x6ff
 [error_code+79/84] error_code+0x4f/0x54
 [txLazyCommit+34/688] txLazyCommit+0x22/0x2b0
 [jfs_lazycommit+844/1200] jfs_lazycommit+0x34c/0x4b0
 [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Code: f6 47 04 02 0f 85 4f 01 00 00 8d 5f 10 0f b6 43 03 85 c0 74 4d 89 c6 8d 
b4 26 00 00 00 00 f6 43 04 f0 0f 85 16 01 00 00 8b 47 0c <0f> b7 40 28 25 00 f0 
00 00 3d 00 40 00 00 0f 84 ef 00 00 00 8b 
----------------------------------------------------------------
Unable to handle kernel paging request at virtual address c2cc8804
 printing eip:
c0251f5d
*pde = 0000b067
*pte = 02cc8000
Oops: 0000 [#1]
DEBUG_PAGEALLOC
Modules linked in: videodev loop binfmt_misc snd_cmipci snd_opl3_lib snd_hwdep 
snd_seq_oss snd_seq_midi snd_seq_midi_event snd_seq snd_via82xx gameport 
snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc 
snd_mpu401_uart snd_rawmidi snd_seq_device snd soundcore 3c59x mii parport_pc 
lp parport video thermal processor fan container button battery ac it87 eeprom 
i2c_sensor i2c_isa i2c_dev i2c_core ntfs nls_iso8859_1 nls_cp437 sg sr_mod 
ide_scsi scsi_mod 8250 serial_core nvram rtc
CPU:    0
EIP:    0060:[txUpdateMap+333/656]    Not tainted VLI
EFLAGS: 00010246   (2.6.13p) 
EIP is at txUpdateMap+0x14d/0x290
eax: c2cc87dc   ebx: e08b6b10   ecx: e0828700   edx: 00000900
esi: 00000001   edi: e08b6b00   ebp: dfdc7f48   esp: dfdc7f10
ds: 007b   es: 007b   ss: 0068
Process jfsCommit (pid: 139, threadinfo=dfdc7000 task=c15725d0)
Stack: e0879094 00000b85 dfdc7f48 c024f181 00000000 00000040 db26571c d9c48cfc 
       00000206 00000000 00000000 d3ea84c0 e0828700 e0828700 dfdc7f74 c02529b2 
       e0828700 00000000 b2f1fb80 00989dee c04d0c60 c157270c dfdc7000 d3ea84c0 
Call Trace:
 [show_stack+127/160] show_stack+0x7f/0xa0
 [show_registers+343/448] show_registers+0x157/0x1c0
 [die+332/688] die+0x14c/0x2b0
 [do_page_fault+921/1791] do_page_fault+0x399/0x6ff
 [error_code+79/84] error_code+0x4f/0x54
 [txLazyCommit+34/688] txLazyCommit+0x22/0x2b0
 [jfs_lazycommit+844/1200] jfs_lazycommit+0x34c/0x4b0
 [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Code: f6 47 04 02 0f 85 4f 01 00 00 8d 5f 10 0f b6 43 03 85 c0 74 4d 89 c6 8d 
b4 26 00 00 00 00 f6 43 04 f0 0f 85 16 01 00 00 8b 47 0c <0f> b7 40 28 25 00 f0 
00 00 3d 00 40 00 00 0f 84 ef 00 00 00 8b 
----------------------------------------------------------------
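
A quick cross-check of the two reports, as a small Python sketch
using only numbers copied from the oopses above (the dictionary
layout and the function name 'fault_offsets' are mine). It compares
the faulting virtual address with 'eax' in each report, and checks
that the decimal and hexadecimal forms of the EIP offset agree.

```python
# Values copied verbatim from the two oops reports.
OOPSES = [
    {"fault": 0xCC05B9A4, "eax": 0xCC05B97C},
    {"fault": 0xC2CC8804, "eax": 0xC2CC87DC},
]


def fault_offsets(oopses):
    """Return, for each oops, the faulting address minus the value of eax."""
    return [o["fault"] - o["eax"] for o in oopses]


offsets = fault_offsets(OOPSES)
print([hex(x) for x in offsets])

# "txUpdateMap+333/656" and "txUpdateMap+0x14d/0x290" are the same spot.
assert (0x14D, 0x290) == (333, 656)
```

Both differences come out as 0x28, which matches the bytes at the
marked position in the 'Code:' line ('0f b7 40 28', which I read as
a 16-bit load from 0x28 past 'eax'). So in both cases the fault
seems to be a read through the same pointer field, and with
DEBUG_PAGEALLOC enabled that might mean the page it points into had
already been freed. That is only my reading of the dump, not
something I have confirmed against the source.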

Sorry for the relative lack of details; I hope that there is
enough to start an investigation.


