Hey Stephane,

Sorry for not following up on this sooner. Here is a bit more info on
my own debugging process.

The dmesg dump was taken with the "low latency desktop" preemption
option enabled but with preemptible RCU turned off.  After moving to
"server" (no preempt), I am still able to lock up the machine
occasionally, though it happens far less frequently.  When it does
happen there appears to be no dmesg output (or things are locking up
differently and oopsing into a state where I can't run dmesg).

Moving to system-wide profiling instead of thread-level profiling
appears to make the lockups disappear completely, both with preempt on
and off.  I hammered the machine over the weekend and did not see a
single lockup when using system-wide profiling, versus a lockup every
6-10 runs when using preemption and maybe every 10-20 runs when preempt
is off.

Sorry this is so anecdotal - let me know if there is something I can do
to provide more systematic testing for you.

cheers
-dave


On Tue, 2009-11-10 at 16:32 +0100, stephane eranian wrote:

> David,
> 
> Could you try recompiling your kernel without preemption turned on?
> 
> On Sun, Nov 8, 2009 at 12:09 AM, David Nellans <dnell...@cs.utah.edu>
> wrote:
> 
>         I've been able to replicate what seems to be a
>         race-condition-based bug.  Sometimes I can run many traces
>         without a crash, and sometimes it will crash on the first
>         attempt.
>         
>         I've attached a log of the dmesg output of what is happening.
>         Please let me know if there is anything else I can provide
>         that might help diagnose this.
>         
>         ----------------------------------------
>         From: Manu Awasthi <manu.awas...@gmail.com>
>         Date: Tue, Nov 3, 2009 at 10:31 AM
>         Subject: kernel panic with monitoring DRAM events
>         To: perfmon2-devel@lists.sourceforge.net
>         
>         
>         Hi all,
>         I have been measuring memory events for the PARSEC benchmark
>         suite on a dual-socket, quad-core Opteron machine with pfmlib
>         version 3.9, kernel pfmon version 2.82 and kernel version
>         2.6.29.6. This is what I use as my command line:
>         
>         >> pfmon --with-header --outfile=test1 --verbose -u
>         --switch-timeout=100
>         -eDRAM_ACCESSES_PAGE:HIT,DRAM_ACCESSES_PAGE:MISS,DRAM_ACCESSES_PAGE:CONFLICT,DRAM_ACCESSES_PAGE:ALL
>         -eDRAM_ACCESSES_PAGE:DCT1_PAGE_HIT,DRAM_ACCESSES_PAGE:DCT1_PAGE_MISS,DRAM_ACCESSES_PAGE:DCT1_PAGE_CONFLICT,DRAM_ACCESSES_PAGE:ALL
>         $PASEC_COMMAND
>         
>         The problem is that sometimes, over different runs of the same
>         (multi-threaded) benchmark, the kernel panics and the machine
>         freezes up. Has anybody ever experienced something of this
>         sort before? Or is there something that I am doing wrong? Is
>         there a better way to measure these stats (system-wide
>         monitoring?)?
>         
>         Any help is appreciated.
>         
>         Thanks,
>         Manu
>         
>         
>         -----------------------------
>         David W Nellans
>         dnell...@cs.utah.edu
>         
>         
>         
> ------------------------------------------------------------------------------
>         Let Crystal Reports handle the reporting - Free Crystal
>         Reports 2008 30-Day
>         trial. Simplify your report design, integration and deployment
>         - and focus on
>         what you do best, core application coding. Discover what's new
>         with
>         Crystal Reports now.  http://p.sf.net/sfu/bobj-july
>         _______________________________________________
>         perfmon2-devel mailing list
>         perfmon2-devel@lists.sourceforge.net
>         https://lists.sourceforge.net/lists/listinfo/perfmon2-devel
>         
> 
> 
> 


-----------------------------
David W Nellans
dnell...@cs.utah.edu