Re: [Ocfs2-users] Random Crash - Diagnosing

Sunil Mushran Tue, 11 Sep 2007 08:55:21 -0700

Please log a bugzilla with this output, along with all the version
numbers: kernel, ocfs2, and distro.


Matthew E. Porter wrote:

Sunil:
We have seen some similar errors in bugzilla. Specifically, what we are seeing is:
Sep 10 09:15:34 sulu kernel: BUG: soft lockup detected on CPU#2!
Sep 10 09:15:34 sulu kernel:  [<c0447f3f>] softlockup_tick+0x98/0xa6
Sep 10 09:15:34 sulu kernel:  [<c042d138>] update_process_times+0x39/0x5c
Sep 10 09:15:34 sulu kernel:  [<c04176f0>] smp_apic_timer_interrupt+0x5c/0x64
Sep 10 09:15:34 sulu kernel:  [<c04049bf>] apic_timer_interrupt+0x1f/0x24
Sep 10 09:15:34 sulu kernel:  [<c041c774>] kmap_atomic+0xb5/0xbb
Sep 10 09:15:34 sulu kernel:  [<c046cd92>] cont_prepare_write+0xd4/0x21d
Sep 10 09:15:34 sulu kernel:  [<f8cb12cf>] ocfs2_prepare_write+0x150/0x19d [ocfs2]
Sep 10 09:15:34 sulu kernel:  [<f8cb06da>] ocfs2_get_block+0x0/0xaa5 [ocfs2]
Sep 10 09:15:34 sulu kernel:  [<c044ecae>] generic_file_buffered_write+0x23f/0x5f1
Sep 10 09:15:34 sulu kernel:  [<f8856ac0>] do_get_write_access+0x43a/0x467 [jbd]
Sep 10 09:15:34 sulu kernel:  [<c0427f65>] current_fs_time+0x4a/0x55
Sep 10 09:15:34 sulu kernel:  [<c044f506>] __generic_file_aio_write_nolock+0x4a6/0x52a
Sep 10 09:15:34 sulu kernel:  [<f8cbebb2>] ocfs2_extend_file+0xf0d/0xf95 [ocfs2]
Sep 10 09:15:34 sulu kernel:  [<c044f831>] generic_file_aio_write_nolock+0x39/0x83
Sep 10 09:15:34 sulu kernel:  [<c044fbb4>] generic_file_write_nolock+0x86/0x9a
Sep 10 09:15:34 sulu kernel:  [<f8ccd226>] ocfs2_write_lock_maybe_extend+0xd39/0xe03 [ocfs2]
Sep 10 09:15:34 sulu kernel:  [<c04352dd>] autoremove_wake_function+0x0/0x2d
Sep 10 09:15:34 sulu kernel:  [<f8cbf190>] ocfs2_file_write+0x189/0x22c [ocfs2]
Sep 10 09:15:34 sulu kernel:  [<f8cbf007>] ocfs2_file_write+0x0/0x22c [ocfs2]
Sep 10 09:15:34 sulu kernel:  [<c0469af3>] vfs_write+0xa1/0x143
Sep 10 09:15:34 sulu kernel:  [<c046a0e5>] sys_write+0x3c/0x63
Sep 10 09:15:34 sulu kernel:  [<c0403eff>] syscall_call+0x7/0xb
This happens on all nodes. The CPU# and timestamp change, but the problem persists. The systems do not restart or panic. The system merely puts every process accessing the OCFS volume in a D state.
Would you still like me to log another bugzilla issue? I am happy to do so if you wish.
Cheers,
  Matthew
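
For readers following this thread: the "D state" described above is the Linux uninterruptible-sleep process state. As a minimal sketch (not part of the original exchange, and not specific to OCFS2), the Python below lists such processes by reading the state field of /proc/<pid>/stat. The function names are hypothetical, and the script does not determine which volume a process is blocked on.

```python
import os


def parse_stat_state(stat_text):
    """Return (pid, comm, state) from the contents of /proc/<pid>/stat."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split on the last ')' instead of splitting on whitespace.
    left, right = stat_text.rsplit(")", 1)
    pid_str, comm = left.split("(", 1)
    state = right.split()[0]
    return int(pid_str), comm, state


def uninterruptible_processes(proc_root="/proc"):
    """Return a list of (pid, comm) for processes currently in state D."""
    found = []
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue  # skip non-process entries such as /proc/meminfo
        try:
            with open(os.path.join(proc_root, entry, "stat")) as fh:
                pid, comm, state = parse_stat_state(fh.read())
        except (OSError, ValueError):
            continue  # process exited, or its stat file could not be parsed
        if state == "D":
            found.append((pid, comm))
    return found


if __name__ == "__main__":
    for pid, comm in uninterruptible_processes():
        print(pid, comm)
```

Running this on each node while the hang is in progress gives a list of blocked processes that could be attached to a bug report together with the stack trace.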


---
Matthew E. Porter
Contegix
Beyond Managed Hosting(r) for Your Enterprise



On Sep 7, 2007, at 12:49 PM, Sunil Mushran wrote:
A bugzilla with the oops stack trace will help.

Matthew E. Porter wrote:
Greetings, I am looking for a good way to diagnose random crashes that are occurring with one of our OCFS clusters. It is a simple 2-node cluster. debugfs does not seem to indicate any issues.
(Also, I would be happy to find a consultant/freelancer to work through this.)
Cheers,
  Matthew

---
Matthew E. Porter
Contegix
Beyond Managed Hosting(r) for Your Enterprise




_______________________________________________
Ocfs2-users mailing list
[email protected]
http://oss.oracle.com/mailman/listinfo/ocfs2-users



