On 05/05/2011 04:23 PM, theHog wrote:
> Hello,
>
> I have a nas with Marvell Kirkwood ARM processor that runs linux
> 2.6.33.2 and 2 SATA disks in RAID 1. I have not partitioned the disks
> but formatted /dev/md0 with jfs and mounted that on /raid. This
> configuration has been running for a couple of months now, with regular
> shutdowns (I only power the nas when I am at home).
>
> However, sometimes, when unmounting the filesystem, the JFS filesystem
> crashes:
>
> Unable to handle kernel NULL pointer dereference at virtual address 00000088
> pgd = c0004000
> [00000088] *pgd=00000000
> Internal error: Oops: 17 [#1]
> last sysfs file:
> /sys/devices/platform/orion-ehci.0/usb1/1-1/1-1.3/1-1.3:1.0/usb/hiddev0/dev
> Modules linked in: usb_storage scsi_wait_scan
> CPU: 0    Not tainted  (2.6.33.2 #1)
> pc : [<c0156be8>]    lr : [<c0162260>]    psr: 20000013
> sp : dfbe3e08  ip : dfbe3ea0  fp : dfbe3e9c
> r10: 00000010  r9 : 00000000  r8 : 00000001
> r7 : 00228990  r6 : 00000000  r5 : e085c0a4  r4 : 00000000
> r3 : df91f780  r2 : 00000000  r1 : 00000000  r0 : 00000001
> Flags: nzCv  IRQs on  FIQs on  Mode SVC_32  ISA ARM  Segment kernel
> Control: 0005397f  Table: 1f158000  DAC: 00000017
> Process jfsCommit (pid: 16, stack limit = 0xdfbe2270)
> Stack: (0xdfbe3e08 to 0xdfbe4000)
> 3e00:                   c0306a20 00000001 0000012c 00000080 c03941cc c0306a30
> 3e20: dfbe3e64 dfbe3e30 c02440d4 00000001 00000000 00033d52 c00aa338 dfbe2000
> 3e40: 00000001 0000000c c038e1c8 df86f1d8 df0f0800 00228990 c0181800 c01816dc
> 3e60: 00000093 00000000 00400008 00000000 dfbe3e94 e08b0510 e08b0528 e085c0a4
> 3e80: 00000000 00000001 00000000 00000010 dfbe3f14 dfbe3ea0 c0162260 c0156b58
> 3ea0: 00000001 00000000 e085c0a4 ffffffff 00000001 00000000 c077efa0 00000000
> 3ec0: df86ef30 00002c58 00000000 000f15e8 00000000 df4748a8 00000000 00000000
> 3ee0: df474f38 dfbe3ef0 c0183064 e08b0500 00000000 e08b0510 e085c0a4 df474f38
> 3f00: e08ad000 00000000 dfbe3f64 dfbe3f18 c0162484 c016210c 00000000 00000010
> 3f20: 00000001 00003500 c0096a1c c009691c c02f9ee8 dfbe2000 dfbe3f7c e085c0a4
> 3f40: dfbe2000 c038ffb8 c0390000 df91f780 00100100 df86ef00 dfbe3f7c dfbe3f68
> 3f60: c0164754 c01623c0 60000013 dfbe2000 dfbe3fc4 dfbe3f80 c0164898 c0164730
> 3f80: dfbe3fc4 00000000 df86ef00 c007a4c4 00100100 00200200 dfbe3fc4 df82df20
> 3fa0: dfbe3fcc c016480c 00000000 00000000 00000000 00000000 dfbe3ff4 dfbe3fc8
> 3fc0: c0091940 c016481c 00000000 00000000 dfbe3fd0 dfbe3fd0 00000000 00000000
> 3fe0: 00000000 00000000 00000000 dfbe3ff8 c0080624 c00918cc 12000710 02400c08
> Backtrace:
> Function entered at [<c0156b48>] from [<c0162260>]
> Function entered at [<c01620fc>] from [<c0162484>]
> Function entered at [<c01623b0>] from [<c0164754>]
> Function entered at [<c0164720>] from [<c0164898>]
>    r5:dfbe2000 r4:60000013
> Function entered at [<c016480c>] from [<c0091940>]
> Function entered at [<c00918bc>] from [<c0080624>]
>    r7:00000000 r6:00000000 r5:00000000 r4:00000000
> Code: e5933174 e3a02000 e3a06000 e5934018 (e5949088)
> ---[ end trace 88625182e1256d7b ]---

Without symbols, I can't glean anything from this.  If you can send me 
the System.map (don't cc the list), maybe I can find something useful.
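
For reference, this is roughly how the addresses in an oops get matched
against a System.map. It is a minimal sketch, assuming the usual
"address type name" layout with one symbol per line. The symbol names and
start addresses in the sample are made up for illustration; only the two
looked-up addresses (pc and lr) come from the oops above.

```python
import bisect


def load_system_map(lines):
    """Parse 'address type name' lines into a sorted list of (address, name)."""
    symbols = []
    for line in lines:
        parts = line.split()
        if len(parts) < 3:
            continue  # skip blank or malformed lines
        symbols.append((int(parts[0], 16), parts[2]))
    symbols.sort()
    return symbols


def resolve(symbols, addr):
    """Return (name, offset) for the nearest symbol at or below addr, else None."""
    starts = [start for start, _ in symbols]
    i = bisect.bisect_right(starts, addr) - 1
    if i < 0:
        return None
    start, name = symbols[i]
    return name, addr - start


if __name__ == "__main__":
    # Hypothetical entries: these names and addresses are NOT from the real map.
    sample = [
        "c0156b00 T hypothetical_func_a",
        "c0162000 T hypothetical_func_b",
    ]
    symbols = load_system_map(sample)
    for addr in (0xC0156BE8, 0xC0162260):  # pc and lr from the oops
        hit = resolve(symbols, addr)
        if hit is None:
            print(f"{addr:08x}: no symbol")
        else:
            print(f"{addr:08x}: {hit[0]}+{hit[1]:#x}")
```

With the real System.map, passing open("System.map") to load_system_map()
instead of the sample list would give the function names for the backtrace
addresses.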

> After rebooting the nas, the jfs filesystem does not mount. I ran the
> jfs.fsck tool to check the filesystem, but it gives a couple of errors:
>
>
> admin@ts219p:~$ sudo jfs.fsck -fv /dev/md0
> jfs.fsck version 1.1.14, 06-Apr-2009
> processing started: 5/5/2011 23.1.20
> The current device is:  /dev/md0
> Open(...READ/WRITE EXCLUSIVE...) returned rc = 0
> Primary superblock is valid.
> The type of file system for the device is JFS.
> Block size in bytes:  4096
> Filesystem size in blocks:  366284356
> **Phase 0 - Replay Journal Log
> LOGREDO:  Log already redone!
> logredo returned rc = 0
> **Phase 1 - Check Blocks, Files/Directories, and  Directory Entries
> Invalid data (43) detected in file system object MA16.
> Primary metadata inode A16 is corrupt.
> Invalid data (43) detected in file system object MA16.
> Secondary metadata inode A16 is corrupt.
> Errors detected in the Primary File/Directory Allocation Table.
> Errors detected in the Secondary File/Directory Allocation Table.

Hmm. Corruption in the inode table metadata, replicated to the secondary 
table.  I've got no idea what may have caused that.

> CANNOT CONTINUE.
> processing terminated:  5/5/2011 23:01:20  with return code: -10049
> exit code: 4.
>
>
> After doing that, the filesystem can be mounted again, but re-running
> fsck produces the same errors. I am a bit concerned about this.
> Before the filesystem crashed, I had put about 1000 files of ~25 MB
> each on it.

Since fsck found an error that it couldn't fix, there's a decent chance
that the kernel oops is triggered by the same corruption.  The kernel
driver might be missing a sanity check.
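
Even without symbols, the "Code:" line of the oops shows what kind of
access faulted, since the word in parentheses is the instruction at the
faulting pc. Below is a minimal sketch that decodes it, assuming the words
are A32 instructions (the oops reports "ISA ARM") and handling only
loads/stores with a plain immediate offset. The function name is made up
for this example. It does not say which function faulted, only that the
failing instruction read through a register that held 0.

```python
def decode_ldr_str_imm(word):
    """Decode an A32 LDR/STR with immediate offset and no writeback.

    Returns (mnemonic, rd, rn, offset), or None for any other encoding.
    """
    if (word >> 28) == 0xF:           # unconditional space, not LDR/STR
        return None
    if (word >> 26) & 0b11 != 0b01:   # not a single data transfer
        return None
    if (word >> 25) & 1:              # register offset form, not handled
        return None
    pre_indexed = (word >> 24) & 1
    writeback = (word >> 21) & 1
    if not pre_indexed or writeback:  # keep only plain [rn, #imm] addressing
        return None
    up = (word >> 23) & 1
    byte = (word >> 22) & 1
    load = (word >> 20) & 1
    rn = (word >> 16) & 0xF
    rd = (word >> 12) & 0xF
    imm = word & 0xFFF
    mnemonic = ("ldr" if load else "str") + ("b" if byte else "")
    return mnemonic, rd, rn, imm if up else -imm


if __name__ == "__main__":
    # Words copied from the "Code:" line; the last one is the faulting one.
    for word in (0xE5933174, 0xE3A02000, 0xE3A06000, 0xE5934018, 0xE5949088):
        decoded = decode_ldr_str_imm(word)
        if decoded is None:
            print(f"{word:08x}: not a simple ldr/str")
        else:
            mnemonic, rd, rn, offset = decoded
            print(f"{word:08x}: {mnemonic} r{rd}, [r{rn}, #{offset:#x}]")
```

The faulting word e5949088 decodes as a load into r9 from r4 + 0x88. The
register dump shows r4 as 00000000, which matches the fault address
00000088. The preceding word, e5934018, is the load that filled r4 from
r3 + 0x18, so the pointer read from that structure was NULL.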

> What could cause (and solve!) this problem? Is JFS on ARM stable?

As far as I know.  I don't have any first-hand experience on ARM.

Thanks,
Shaggy

_______________________________________________
Jfs-discussion mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/jfs-discussion
