Hello,

I have been using the perfmon patch on a Xeon E5345 recently and got
multiple CPU lockups that seem to be related to using pipes and maybe
vmsplice. I will try to take some time to reduce my user-space
application and send it here so that you can reproduce the problem. But
first, are you aware of any such problems with the perfmon patch? Also, is
there anything that is supposed to be more stable than 2.6.28 + kernel patch
2.6.28-1 from [1]? I have actually reproduced the lockups with 2.6.28,
2.6.27 and 2.6.24.

There are 2 backtraces below (the first one from 2.6.28, the second one from
2.6.24). They don't occur at the same point in the application. I've seen
other ones containing vmsplice as well. Do they ring a bell?

Thanks,
Brice

[1]
http://sourceforge.net/project/showfiles.php?group_id=144822&package_id=159787&release_id=662396


[  221.193247] BUG: soft lockup - CPU#5 stuck for 61s!
[  221.193252] Modules linked in: perfmon_intel_core nfs lockd nfs_acl sunrpc ipv6 dm_snapshot dm_mirror dm_region_hash dm_log dm_mod loop evdev iTCO_wdt psmouse serio_raw pcspkr rng_core i5000_eds
[  221.193252] CPU 5:
[  221.193252] Modules linked in: perfmon_intel_core nfs lockd nfs_acl sunrpc ipv6 dm_snapshot dm_mirror dm_region_hash dm_log dm_mod loop evdev iTCO_wdt psmouse serio_raw pcspkr rng_core i5000_eds
[  221.193252] Pid: 4314, comm: IMB-MPI1 Not tainted 2.6.28-papi #2
[  221.193252] RIP: 0010:[<ffffffff802374b8>]  [<ffffffff802374b8>] finish_task_switch+0x34/0xc4
[  221.193252] RSP: 0018:ffff88012a953a88  EFLAGS: 00000283
[  221.193252] RAX: ffff88012d4bc580 RBX: ffff88012a953aa8 RCX: ffffffff80694468
[  221.193252] RDX: 0000000000000000 RSI: ffff88012d4bc580 RDI: ffff88002806c700
[  221.193252] RBP: ffff88012a953aa8 R08: ffff88012a952000 R09: ffffffff8043d2c1
[  221.193252] R10: ffff88012f6cf968 R11: ffffffffa0317099 R12: ffffffff8032b367
[  221.193252] R13: ffffffff80212cda R14: 0000000000000000 R15: ffff88012d4bc580
[  221.193252] FS:  00007f098e4f76f0(0000) GS:ffff88012fb162c0(0000) knlGS:0000000000000000
[  221.193252] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[  221.193252] CR2: 00007f5ddbd399e0 CR3: 000000012bc58000 CR4: 00000000000007e0
[  221.193252] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  221.193252] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[  221.193252] Call Trace:
[  221.193252]  [<ffffffff802374af>] ? finish_task_switch+0x2b/0xc4
[  221.193252]  [<ffffffff8043bc9b>] thread_return+0x3d/0xc7
[  221.193252]  [<ffffffff8029bb0a>] ? alloc_page_vma+0x12d/0x149
[  221.193252]  [<ffffffff80288a06>] ? handle_mm_fault+0x883/0x8be
[  221.193252]  [<ffffffff8043be8f>] schedule_timeout+0x1e/0xad
[  221.193252]  [<ffffffff8022fc4e>] task_rq_lock+0x50/0x89
[  221.193252]  [<ffffffff80221bdd>] default_spin_lock_flags+0x5/0x8
[  221.193252]  [<ffffffff8043d2c1>] _spin_lock_irqsave+0x24/0x2c
[  221.193252]  [<ffffffff80221bdd>] default_spin_lock_flags+0x5/0x8
[  221.193252]  [<ffffffff8043d2c1>] _spin_lock_irqsave+0x24/0x2c
[  221.193252]  [<ffffffff8024d97f>] prepare_to_wait+0x15/0x58
[  221.193252]  [<ffffffff80427007>] unix_stream_recvmsg+0x28d/0x60a
[  221.193252]  [<ffffffff8024d862>] autoremove_wake_function+0x0/0x2e
[  221.193252]  [<ffffffff8027d632>] __alloc_pages_internal+0xd2/0x420
[  221.193252]  [<ffffffff803bf480>] sock_aio_read+0x12c/0x140
[  221.193252]  [<ffffffff8027f9d2>] release_pages+0x1d2/0x1e4
[  221.193252]  [<ffffffff802a6276>] do_sync_read+0xce/0x113
[  221.193252]  [<ffffffff80252d1d>] getnstimeofday+0x38/0x92
[  221.193252]  [<ffffffff8024d862>] autoremove_wake_function+0x0/0x2e
[  221.193252]  [<ffffffff802a6c83>] vfs_read+0xbd/0x153
[  221.193252]  [<ffffffff802a6dd5>] sys_read+0x45/0x6e
[  221.193252]  [<ffffffff8020bfca>] system_call_fastpath+0x16/0x1b



[16989.244119] BUG: soft lockup - CPU#1 stuck for 11s!
[16989.258001] CPU 1:
[16989.262054] Modules linked in: perfmon_intel_core nfs lockd nfs_acl sunrpc loop psmouse rtc_cmos i5000_edac pcspkr rtc_core evdev serio_raw rng_core rtc_lib edac_core shpchp pci_hotplug button ext3 jbd mbcache sg sr_mod cdrom piix generic ide_core sd_mod ata_piix ata_generic libata megaraid_sas ehci_hcd uhci_hcd scsi_mod bnx2 zlib_inflate thermal processor fan
[16989.326299] Pid: 10096, comm: IMB-MPI1 Not tainted 2.6.24-papi #2
[16989.339667] RIP: 0010:[<ffffffff8040b243>]  [<ffffffff8040b243>] mutex_lock+0x3/0xb
[16989.354964] RSP: 0018:ffff81012a4e1c90  EFLAGS: 00000246
[16989.365562] RAX: ffff81012bc5eb40 RBX: ffff81012a4e1f50 RCX: 0000000000000001
[16989.379795] RDX: 0000000000000001 RSI: 0000000000010000 RDI: ffff81012bc5eb40
[16989.394029] RBP: ffff81012bcf34b8 R08: ffff81012a4e1f50 R09: ffffffff802a6b49
[16989.408262] R10: 0000000000000001 R11: ffffffff802ecc77 R12: ffff81012a4e1ea8
[16989.422496] R13: ffff81012bdf2150 R14: ffff81012bcf34b8 R15: ffff81012a4e1ea8
[16989.436729] FS:  00002ac81052d110(0000) GS:ffff81012fa72940(0000) knlGS:0000000000000000
[16989.452881] CS:  0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[16989.464345] CR2: 00002ac812e79088 CR3: 000000012c981000 CR4: 00000000000007e0
[16989.478580] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[16989.492813] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[16989.507047] 
[16989.507048] Call Trace:
[16989.514906]  [<ffffffff802a6bbe>] pipe_read+0x75/0x39f
[16989.525157]  [<ffffffff802a6b49>] pipe_read+0x0/0x39f
[16989.535238]  [<ffffffff802a026a>] do_sync_readv_writev+0xc0/0x107
[16989.547396]  [<ffffffff8029ed74>] get_unused_fd_flags+0x80/0x118
[16989.559386]  [<ffffffff802a9f35>] do_path_lookup+0x1bc/0x212
[16989.570672]  [<ffffffff8024d691>] autoremove_wake_function+0x0/0x2e
[16989.583179]  [<ffffffff802324d2>] __wake_up+0x38/0x4f
[16989.593249]  [<ffffffff802a00f4>] rw_copy_check_uvector+0x6d/0xe4
[16989.605409]  [<ffffffff802a095e>] do_readv_writev+0xce/0x1a5
[16989.616702]  [<ffffffff8029f088>] do_filp_open+0x2d/0x3d
[16989.627302]  [<ffffffff802a620d>] pipe_release+0x7e/0x89
[16989.637895]  [<ffffffff802a0b8e>] sys_readv+0x45/0x93
[16989.647976]  [<ffffffff8020bfce>] system_call+0x7e/0x83


