Please don't reply to lustre-devel. Instead, comment in Bugzilla by using the 
following link:
https://bugzilla.lustre.org/show_bug.cgi?id=11324

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Severity|S2 (major)                  |S1 (critical)
  Status Whiteboard|2006-12-06: Suspected       |2006-01-06: Original damage
                   |hardware failure; LLNL to   |inflicted by hardware
                   |test updated e2fsk on broken|failure, e2fsck claims to
                   |filesystem                  |have repaired it, OST still
                   |                            |asserts on start.


We continue to hit the following assertion when starting the pigs[33-40] OSTs.
A few weeks back they suffered damage due to a DDN hardware failure and have
been nothing but trouble since.  This is in spite of running e2fsck many times
with the latest fixes applied and finding no damage in the filesystems.  Yet
when we start the OSTs we immediately hit the assertion below once the recovery
window closes.

We need to diagnose the root cause of this issue immediately to get the
filesystem back online.  The outage is disrupting all of our OCF systems; the
filesystem has been offline since 4pm (~10 hours).

Marking bug Sev 1 until the FS can be brought back online.

-----

2007-01-06 01:21:13 Assertion failure in mb_free_blocks() at
/tmp/root.29834/rpm/BUILD/lustre-1.4.6.95_17.4llnl/lustre/ldiskfs/mballoc.c:771:
"mb_test_bit(block, LDISKFS_MB_BITMAP(e3b))"
2007-01-06 01:21:13 ----------- [cut here ] --------- [please bite here ] ---------
2007-01-06 01:21:13 Kernel BUG at mballoc:771
2007-01-06 01:21:13 invalid operand: 0000 [1] SMP
2007-01-06 01:21:13 CPU 0
2007-01-06 01:21:13 Modules linked in: obdfilter(U) fsfilt_ldiskfs(U) ldiskfs(U)
jbd(U) ost(U) ptlrpc(U) obdclass(U) lvfs(U) ksocklnd(U) lnet(U) libcfs(U)
perfctr(U) i2c_i801(U) netdump(U) e752x_edac(U) edac_mc(U) i2c_dev(U)
i2c_core(U) dm_mod(U) rtc(U) md(U) uhci_hcd(U) ehci_hcd(U) floppy(U) sd_mod(U)
qla2300(U) qla2xxx(U) scsi_transport_fc(U) scsi_mod(U) unionfs(U) nfs(U)
lockd(U) sunrpc(U) e1000(U)
2007-01-06 01:21:13 Pid: 3335, comm: ll_ost_io_29 Not tainted 2.6.9-54chaos
2007-01-06 01:21:13 RIP: 0010:[<ffffffffa03b7dda>]
<ffffffffa03b7dda>{:ldiskfs:mb_free_blocks+295}
2007-01-06 01:21:13 RSP: 0018:000001005d13d588  EFLAGS: 00010212
2007-01-06 01:21:13 RAX: 00000000000000aa RBX: 000001005d13d628 RCX:
000001005b2fc4a8
2007-01-06 01:21:13 RDX: 000001007c55cc01 RSI: 0000000000000246 RDI:
ffffffff803ba240
2007-01-06 01:21:13 RBP: 0000010074cc4e48 R08: ffffffff803ba248 R09:
000001005d13d628
2007-01-06 01:21:13 R10: 0000000100000000 R11: 0000000000000000 R12:
0000000000000037
2007-01-06 01:21:13 R13: 0000000000001930 R14: 0000000000000036 R15:
0000000000001931
2007-01-06 01:21:13 FS:  0000002a95b356e0(0000) GS:ffffffff804e2900(0000)
knlGS:0000000000000000
2007-01-06 01:21:13 CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
2007-01-06 01:21:13 CR2: 0000002a9556c000 CR3: 0000000000101000 CR4:
00000000000006e0
2007-01-06 01:21:13 Process ll_ost_io_29 (pid: 3335, threadinfo
000001005d13c000, task 000001005d137710)
2007-01-06 01:21:13 Stack: 0000000100000000 0000000000003cb7 0000010074cc4e48
0000000000000037
2007-01-06 01:21:13        000000001e5b9930 000001004fedbc00 0000000000003cb7
ffffffffa03ba117
2007-01-06 01:21:13        000000001e5b9966 0000000000001930
2007-01-06 01:21:13 Call
Trace:<ffffffffa03ba117>{:ldiskfs:ldiskfs_mb_free_blocks+903}
2007-01-06 01:21:13        <ffffffffa03bbdbd>{:ldiskfs:ldiskfs_free_blocks+61}
2007-01-06 01:21:13        
<ffffffffa03b678e>{:ldiskfs:ldiskfs_remove_blocks+264}
2007-01-06 01:21:13       
<ffffffffa03b629f>{:ldiskfs:ldiskfs_ext_remove_space+1223}
2007-01-06 01:21:13        
<ffffffffa03a6b5b>{:ldiskfs:ldiskfs_mark_inode_dirty+65}
2007-01-06 01:21:13        <ffffffffa03b6e42>{:ldiskfs:ldiskfs_ext_truncate+324}
2007-01-06 01:21:13        <ffffffffa03a8133>{:ldiskfs:ldiskfs_truncate+268}
<ffffffff8017a413>{__getblk+42}
2007-01-06 01:21:13        <ffffffff80165a11>{unmap_mapping_range+339}
<ffffffff80165abc>{vmtruncate+162}
2007-01-06 01:21:13        <ffffffff8018f479>{inode_setattr+54}
<ffffffffa03a7bc4>{:ldiskfs:ldiskfs_setattr+296}
2007-01-06 01:21:13       
<ffffffffa03d69d4>{:fsfilt_ldiskfs:fsfilt_ldiskfs_setattr+325}
2007-01-06 01:21:13        <ffffffffa03ef19a>{:obdfilter:filter_destroy+1823}
2007-01-06 01:21:13        <ffffffffa02e84aa>{:ptlrpc:lustre_pack_reply+1817}
2007-01-06 01:21:13        <ffffffffa037aaa7>{:ost:ost_handle+4322}
<ffffffffa01db1bb>{:libcfs:libcfs_debug_msg+1558}
2007-01-06 01:21:13        <ffffffff801e2b25>{vsnprintf+848}
<ffffffff801e2e36>{snprintf+131}
2007-01-06 01:21:13       
<ffffffffa02ed469>{:ptlrpc:ptlrpc_server_handle_request+2714}
2007-01-06 01:21:13        <ffffffff8013bb5c>{__mod_timer+293}
<ffffffffa02ee5f3>{:ptlrpc:ptlrpc_main+2127}
2007-01-06 01:21:13        <ffffffff8012f92c>{default_wake_function+0}
<ffffffffa02edd97>{:ptlrpc:ptlrpc_retry_rqbds+0}
2007-01-06 01:21:13        <ffffffffa02edd97>{:ptlrpc:ptlrpc_retry_rqbds+0}
<ffffffff8010ff3f>{child_rip+8}
2007-01-06 01:21:13        <ffffffffa02edda4>{:ptlrpc:ptlrpc_main+0}
<ffffffff8010ff37>{child_rip+0}
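
For anyone reading the trace without the mballoc source at hand, here is a rough sketch of the kind of check that appears to be failing. It is not the ldiskfs code. It assumes, based only on the assertion expression above, that a set bit in the group bitmap means "block in use" and that freeing a block requires that bit to be set. The class and function names (GroupBitmap, demo) are hypothetical and exist only for illustration.

```python
class GroupBitmap:
    """Toy in-memory block bitmap for one group. True means 'in use' (assumption)."""

    def __init__(self, nblocks):
        self.bits = [False] * nblocks

    def allocate(self, first, count):
        # Mark a run of blocks as in use.
        for block in range(first, first + count):
            self.bits[block] = True

    def free(self, first, count):
        for block in range(first, first + count):
            # Analogue of the failing check: a block being freed must
            # currently be marked in use, otherwise the bitmap and the
            # caller's view of the file's blocks disagree.
            if not self.bits[block]:
                raise AssertionError(f"block {block} is already free in the bitmap")
            self.bits[block] = False


def demo():
    bitmap = GroupBitmap(16)
    bitmap.allocate(4, 2)
    bitmap.free(4, 2)          # consistent: blocks 4-5 were in use
    try:
        bitmap.free(4, 2)      # inconsistent: blocks 4-5 are already free
    except AssertionError as err:
        return str(err)
    return None
```

Under that assumption, the trace would be consistent with an object destroy (filter_destroy) truncating a file whose extents point at blocks the bitmap already considers free. That reading is an inference from the trace, not a confirmed diagnosis.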

_______________________________________________
Lustre-devel mailing list
[email protected]
https://mail.clusterfs.com/mailman/listinfo/lustre-devel
