Thank you for responding.
I think UEK5 is based on the RHEL5 kernel.
Does the same problem arise on UEK5 as well?
(2011/10/05 1:45), Sunil Mushran wrote:
int sigprocmask(int how, sigset_t *set, sigset_t *oldset)
{
        int error;

        spin_lock_irq(&current->sighand->siglock);   /* <-- CRASH */
        if (oldset)
                *oldset = current->blocked;
        ...
}
current->sighand is NULL. So definitely a race. Generic kernel issue.
Ping your kernel vendor.
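
As a cross-check on the NULL diagnosis, the faulting address in the oops (CR2 and RDI are both 0808) is what one would expect if sighand is NULL and the lock sits 0x808 bytes into the structure. The sketch below is only a userspace model: the struct names and field sizes are assumptions about the x86_64 layout (an 8-byte header, 64 signal entries of 32 bytes each, then the lock), not something copied from the kernel headers.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>   /* offsetof */
#include <stdint.h>

/* Hypothetical model of one per-signal action entry: four 8-byte fields. */
struct k_sigaction_model {
    uint64_t sa_handler;
    uint64_t sa_flags;
    uint64_t sa_restorer;
    uint64_t sa_mask;
};

/* Hypothetical model of the signal-handler structure (assumed layout). */
struct sighand_model {
    int32_t count;                        /* reference count */
    int32_t pad;                          /* alignment padding, made explicit */
    struct k_sigaction_model action[64];  /* one entry per signal */
    int32_t siglock;                      /* the lock, assumed 4 bytes */
};

static_assert(sizeof(struct k_sigaction_model) == 32,
              "model assumes 32-byte action entries");

/* Address the lock instruction would touch for a given sighand pointer. */
static uintptr_t siglock_address(uintptr_t sighand)
{
    return sighand + offsetof(struct sighand_model, siglock);
}
```

Under these assumptions, siglock_address(0) evaluates to 0x808, which matches the address reported in the oops.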
On 10/03/2011 07:49 PM, Hideyasu Kojima wrote:
Hi,
I run an ocfs2/drbd active-active 2-node cluster.
ocfs2 version is 1.4.7-1
ocfs2-tools version is 1.4.4
Linux version is RHEL 5.4 (2.6.18-164.el5 x86_64)
One node crashed once with a kernel panic.
What is the cause?
Below is the analysis of the vmcore.
Unable to handle kernel NULL pointer dereference at 0808 RIP:
[80064ae6] _spin_lock_irq+0x1/0xb
PGD 187e15067 PUD 187e16067 PMD 0
Oops: 0002 [1] SMP
last sysfs file:
/devices/pci:00/:00:09.0/:06:00.0/:07:00.0/irq
CPU 1
Modules linked in: mptctl mptbase softdog autofs4 ipmi_devintf ipmi_si
ipmi_msghandler ocfs2(U) ocfs2_dlmfs(U) ocfs2_dlm(U)
ocfs2_nodemanager(U) configfs drbd(U) bonding ipv6 xfrm_nalgo crypto_api
bnx2i(U) libiscsi2 scsi_transport_iscsi2 scsi_transport_iscsi cnic(U)
dm_mirror dm_multipath scsi_dh video hwmon backlight sbs i2c_ec i2c_core
button battery asus_acpi acpi_memhotplug ac parport_pc lp parport joydev
sr_mod cdrom sg pcspkr serio_raw hpilo bnx2(U) dm_raid45 dm_message
dm_region_hash dm_log dm_mod dm_mem_cache hpahcisr(PU) ata_piix libata
shpchp cciss sd_mod scsi_mod ext3 jbd uhci_hcd ohci_hcd ehci_hcd
Pid: 21924, comm: res Tainted: P 2.6.18-164.el5 #1
RIP: 0010:[80064ae6] [80064ae6] _spin_lock_irq+0x1/0xb
RSP: 0018:81008b1cfae0 EFLAGS: 00010002
RAX: 810187af4040 RBX: RCX: 8101342b7b80
RDX: 81008b1cfb98 RSI: 81008b1cfba8 RDI: 0808
RBP: 81008b1cfb98 R08: R09:
R10: 810075463090 R11: 88595b95 R12: 81008b1cfba8
R13: 81007f070520 R14: 0001 R15: 81008b1cfce8
FS: () GS:810105d51840()
knlGS:
CS: 0010 DS: ES: CR0: 8005003b
CR2: 0808 CR3: 000187e14000 CR4: 06e0
Process res (pid: 21924, threadinfo 81008b1ce000, task 810187af4040)
Stack: 8001db30 81007f070520 885961f3
810105d39400
88596323 06ff813231393234 810075463018 810075463018
0297 81007f070520 810075463028 0246
Call Trace:
[8001db30] sigprocmask+0x28/0xdb
[885961f3] :ocfs2:ocfs2_delete_inode+0x0/0x1691
[88596323] :ocfs2:ocfs2_delete_inode+0x130/0x1691
[88581f16] :ocfs2:ocfs2_drop_lock+0x67a/0x77b
[8858026a] :ocfs2:ocfs2_remove_lockres_tracking+0x10/0x45
[885961f3] :ocfs2:ocfs2_delete_inode+0x0/0x1691
[8002f49e] generic_delete_inode+0xc6/0x143
[88595c85] :ocfs2:ocfs2_drop_inode+0xf0/0x161
[8000d46e] dput+0xf6/0x114
[800e9c44] prune_one_dentry+0x66/0x76
[8002e958] prune_dcache+0x10f/0x149
[8004d66e] shrink_dcache_parent+0x1c/0xe1
[80104f8b] proc_flush_task+0x17c/0x1f6
[8008fa2c] sched_exit+0x27/0xb5
[80018024] release_task+0x387/0x3cb
[80015c50] do_exit+0x865/0x911
[80049281] cpuset_exit+0x0/0x88
[8002b080] get_signal_to_deliver+0x42c/0x45a
[8005ae7b] do_notify_resume+0x9c/0x7af
[8008b6a2] deactivate_task+0x28/0x5f
[80021f3f] __up_read+0x19/0x7f
[80066b58] do_page_fault+0x4fe/0x830
[800b65b2] audit_syscall_exit+0x336/0x362
[8005d32e] int_signal+0x12/0x17
Code: f0 ff 0f 0f 88 f3 00 00 00 c3 53 48 89 fb e8 33 f5 02 00 f0
RIP [80064ae6] _spin_lock_irq+0x1/0xb
RSP 81008b1cfae0
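
As a reading aid for the "Oops: 0002" line in the dump above: on x86 the number is the page-fault error code in hex, and its low three bits say whether the page was present, whether the access was a write, and whether the CPU was in user mode. The helper below is a minimal sketch; the names pf_error and decode_pf_error are made up for illustration.

```c
#include <assert.h>   /* assert(), used in the usage note below */
#include <stdbool.h>

/* Low three bits of the x86 page-fault error code. */
struct pf_error {
    bool protection;  /* bit 0: 1 = protection violation, 0 = page not present */
    bool write;       /* bit 1: 1 = write access, 0 = read access */
    bool user;        /* bit 2: 1 = fault in user mode, 0 = kernel mode */
};

/* Split an oops error code into its three basic flags. */
static struct pf_error decode_pf_error(unsigned long code)
{
    struct pf_error e;

    e.protection = (code & 0x1UL) != 0;
    e.write      = (code & 0x2UL) != 0;
    e.user       = (code & 0x4UL) != 0;
    return e;
}
```

For the value in this dump, assert(decode_pf_error(0x2).write) holds while the protection and user flags are clear, that is, a kernel-mode write to a page that is not present, which fits a locked write to address 0808.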
crash> bt
PID: 21924 TASK: 810187af4040 CPU: 1 COMMAND: res
#0 [81008b1cf840] crash_kexec at 800ac5b9
#1 [81008b1cf900] __die at 80065127
#2 [81008b1cf940] do_page_fault at 80066da7
#3 [81008b1cfa30] error_exit at 8005dde9
[exception RIP: _spin_lock_irq+1]
RIP: 80064ae6 RSP: 81008b1cfae0 RFLAGS: 00010002
RAX: 810187af4040 RBX: RCX: 8101342b7b80
RDX: 81008b1cfb98 RSI: 81008b1cfba8 RDI: 0808
RBP: 81008b1cfb98 R8: R9:
R10: 810075463090 R11: 88595b95 R12: 81008b1cfba8
R13: 81007f070520 R14: 0001 R15: 81008b1cfce8
ORIG_RAX: CS: 0010 SS: 0018
#4 [81008b1cfae0] sigprocmask at 8001db30
#5 [81008b1cfb00] ocfs2_delete_inode at 88596323
#6 [81008b1cfbf0] generic_delete_inode at 8002f49e
#7 [81008b1cfc10] ocfs2_drop_inode at 88595c85
#8