Hi all,

Remember that the timing CPU should _not_ be used for any performance-related 
experiments. Stick to the in-order and out-of-order CPUs for any such use cases.

In general, I would also expect fewer issues with the classic memory system 
(especially in full-system mode), and it does a fine job of modelling 
crossbar-based many-core systems. At the moment it does not support X86 out of 
the box, but it may still be worth considering if you’re having issues with 
Ruby.

Andreas

From: gem5-users 
<[email protected]<mailto:[email protected]>> on behalf of 
Ruohuang Zheng <[email protected]<mailto:[email protected]>>
Reply-To: gem5 users mailing list 
<[email protected]<mailto:[email protected]>>
Date: Tuesday, 16 August 2016 at 02:46
To: gem5 users mailing list <[email protected]<mailto:[email protected]>>
Subject: Re: [gem5-users] FW: Gem5-Ruby_HBM_Parsec_Deadlock

Hi,

You can try running the benchmarks with the timing CPU instead of the detailed 
CPU to see if that works. As far as I know, there are various bugs in Ruby, and 
using the detailed CPU makes them more likely to be exposed.

On Mon, Aug 15, 2016 at 4:37 PM, George Mappouras 
<[email protected]<mailto:[email protected]>> wrote:
Thanks for the reply. No, I do not use checkpoints. I am aware of the checkpoint 
problem (found that out the hard way XD). I run full system from start to end, 
running one of the parsec benchmarks each time with the small input size (I do 
have multiple machines running in parallel).

George

________________________________
From: [email protected]<mailto:[email protected]>
Date: Mon, 15 Aug 2016 16:07:39 -0700

To: [email protected]<mailto:[email protected]>
Subject: Re: [gem5-users] FW: Gem5-Ruby_HBM_Parsec_Deadlock

Hi,

Are you taking checkpoints? If yes, then getting a deadlock is normal.

On Mon, Aug 15, 2016 at 2:28 PM, George Mappouras 
<[email protected]<mailto:[email protected]>> wrote:
Hi Jason,

Thanks for the suggestions. I use MESI_Two_Level and I also compiled gem5 for 
that protocol like this:
scons RUBY=TRUE PROTOCOL=MESI_Two_Level build/X86/gem5.opt -j8

"The system you're simulating is quite a stress test for the Ruby protocol 
you're using! "
Why do you say that? Could you give me some insight into why MESI could make 
my system slower compared to other protocols? What would you suggest I use?

George

________________________________
From: [email protected]<mailto:[email protected]>
Date: Mon, 15 Aug 2016 17:01:47 +0000
To: [email protected]<mailto:[email protected]>
Subject: Re: [gem5-users] FW: Gem5-Ruby_HBM_Parsec_Deadlock


Hi George,

The system you're simulating is quite a stress test for the Ruby protocol 
you're using! What protocol have you compiled?

The problem you're running into could be very simple. It's possible that due to 
the high bandwidth of the system, some of the queues in Ruby are filling up and 
causing the average memory access latency to skyrocket due to queuing delays. 
If this happens, the protocol could be "correct" but still cause a deadlock 
detection. In this case, you may be able to increase the deadlock threshold and 
see the application start to work again. We often see this with GPU workloads.
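
As a rough illustration of the check described above, here is a minimal Python 
sketch that pulls the tick values out of the panic message reported in this 
thread and converts the outstanding time to cycles. The function names are 
hypothetical, and the 1000 ticks per cycle is an assumption (1 ps ticks with a 
1 GHz clock); Ruby may be running off a different clock in your configuration, 
so check that before comparing the result with your deadlock threshold.

```python
import re

# Panic text copied from the report in this thread (joined onto one line).
PANIC = (
    "version: 1 request.paddr: 0x53d6a000 m_writeRequestTable: 1 "
    "current time: 5677053833000 issue_time: 5676227351000 "
    "difference: 826482000"
)


def parse_panic(text):
    """Return (current_time, issue_time, difference) in ticks."""
    values = []
    for key in ("current time", "issue_time", "difference"):
        match = re.search(re.escape(key) + r":\s*(\d+)", text)
        if match is None:
            raise ValueError("missing field: " + key)
        values.append(int(match.group(1)))
    return tuple(values)


def outstanding_cycles(difference_ticks, ticks_per_cycle):
    """Convert a tick difference to whole cycles of the given clock."""
    return difference_ticks // ticks_per_cycle


current, issue, diff = parse_panic(PANIC)
# The reported difference is just current time minus issue time.
assert current - issue == diff

# ASSUMPTION: 1000 ticks per cycle (1 ps ticks, 1 GHz clock).
cycles = outstanding_cycles(diff, 1000)
print("request outstanding for", cycles, "cycles")
```

If the number of cycles is only slightly above the threshold your 
configuration uses, queuing delay is a plausible explanation; if it is far 
above, a protocol bug becomes more likely.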

However, it's more likely a bug somewhere in the protocol you're using. To 
debug this, you'll need to dig into the protocol. The debug flag 
"ProtocolTrace" is useful here. With this debug flag you'll see every 
transition in Ruby. With this information you should be able to trace back and 
find the memory operation that's causing the deadlock. I would also suggest 
using "--debug-start=<tick>" and picking the highest tick value you can before 
the offending operation (e.g., a little less than 5676227351000, the issue_time 
in your panic message). Otherwise the trace may be 10s-100s of GB (and take 
days to generate).
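
To make the suggestion above concrete, here is a minimal Python sketch that 
picks a start tick a little before the issue_time from the panic message and 
assembles a gem5 command line with the ProtocolTrace flag enabled. The helper 
names, the one-billion-tick margin, and the trace file name are assumptions 
for illustration only; "--debug-file" is not mentioned in this thread and is 
added here just to keep the trace out of the console. The remaining fs.py 
options from the original command would be appended to config_args.

```python
def debug_start_tick(issue_time, margin_ticks):
    """Pick a tick slightly before the offending request was issued."""
    return max(0, issue_time - margin_ticks)


def gem5_debug_command(binary, config, config_args, start_tick, trace_file):
    """Build an argv list; gem5's own options go before the config script."""
    return [
        binary,
        "--debug-flags=ProtocolTrace",
        "--debug-start=%d" % start_tick,
        "--debug-file=" + trace_file,
        config,
    ] + list(config_args)


# issue_time taken from the panic message in this thread.
ISSUE_TIME = 5676227351000
# ASSUMPTION: a margin of 1e9 ticks; tune it to how far back you need to look.
start = debug_start_tick(ISSUE_TIME, 1000000000)

cmd = gem5_debug_command(
    "./build/X86/gem5.opt",
    "configs/example/fs.py",
    ["--ruby", "-n", "8"],  # plus the rest of the original options
    start,
    "protocol_trace.out",  # hypothetical file name
)
print(" ".join(cmd))
```

Starting the trace later keeps the output small; if the trace does not reach 
back far enough to show the request being issued, increase the margin and 
rerun.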

Hopefully this helps you get on the right track. Good luck!

Jason

On Wed, Aug 10, 2016 at 6:50 PM George Mappouras 
<[email protected]<mailto:[email protected]>> wrote:
Hi all,

I had some trouble while running Parsec benchmarks with gem5 + Ruby (using the 
MESI two-level protocol). I found that some of the benchmarks cause gem5 to 
crash because a deadlock was detected. The configuration I use is as follows:

I have 8 nodes connected in a ring. Each node is a core connected to a 
private 64KB L1 cache and one channel of High Bandwidth Memory (HBM). Each 
core also has one of the 8 banks of the shared 8MB L2 cache attached to it. The 
command I run looks like this:

 ./build/X86/gem5.opt configs/example/fs.py --disk-image=x86root-parsec.img 
--kernel=x86_64-vmlinux-2.6.22.9.smp -n 8 --cpu-type=detailed --cpu-clock=1GHz 
--caches --l1d_size=64kB --num-l2caches=8 --l2_size=8MB 
--mem-type=HBM_1000_4H_x128 --mem-channels=8 --mem-size=2GB --ruby --num-dirs=8 
--topology=Torus --mesh-rows=1 --access-backing-store 
--script=a_parsec_script.sh

I use the latest version of gem5 and I have no problem booting or running 
commands on the simulated machine. However, as I mentioned above, some 
benchmarks cause gem5 to crash with a message like this:

panic: Possible Deadlock detected. Aborting!
version: 1 request.paddr: 0x53d6a000 m_writeRequestTable: 1 current time: 
5677053833000 issue_time: 5676227351000 difference: 826482000
 @ tick 5677053833000
[wakeup:build/X86/mem/ruby/system/Sequencer.cc, line 119]
Memory Usage: 5799824 KBytes
Program aborted at tick 5677053833000
--- BEGIN LIBC BACKTRACE ---
./build/X86/gem5.opt(_Z15print_backtracev+0x15)[0x9f8275]
./build/X86/gem5.opt(_Z12abortHandleri+0x36)[0xa09536]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x10330)[0x7f3dd5cdf330]
/lib/x86_64-linux-gnu/libc.so.6(gsignal+0x37)[0x7f3dd4530c37]
/lib/x86_64-linux-gnu/libc.so.6(abort+0x148)[0x7f3dd4534028]
./build/X86/gem5.opt(_Z15__exit_epilogueiPKcS0_iS0_+0x1ec)[0x9bba6c]
./build/X86/gem5.opt(_Z14__exit_messageIIjmmmmmEEvPKciS1_S1_iS1_DpRKT_+0xc7)[0x93ca27]
./build/X86/gem5.opt(_ZN9Sequencer6wakeupEv+0x266)[0x93a936]
./build/X86/gem5.opt(_ZN10EventQueue10serviceOneEv+0xb1)[0xa018e1]
./build/X86/gem5.opt(_Z9doSimLoopP10EventQueue+0x38)[0xa22938]
./build/X86/gem5.opt(_Z8simulatem+0x1fb)[0xa22ebb]
./build/X86/gem5.opt[0x969d7c]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x45f7)[0x7f3dd58f7af7]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x80d)[0x7f3dd58f954d]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x48d8)[0x7f3dd58f7dd8]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4b59)[0x7f3dd58f8059]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4b59)[0x7f3dd58f8059]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x80d)[0x7f3dd58f954d]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f3dd58f9682]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x563e)[0x7f3dd58f8b3e]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x80d)[0x7f3dd58f954d]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x48d8)[0x7f3dd58f7dd8]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x80d)[0x7f3dd58f954d]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f3dd58f9682]
/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0(PyRun_StringFlags+0x79)[0x7f3dd58f34b9]
./build/X86/gem5.opt(_Z6m5MainiPPc+0x5f)[0xa08caf]
./build/X86/gem5.opt(main+0x33)[0x701933]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf5)[0x7f3dd451bf45]
./build/X86/gem5.opt[0x725e83]
--- END LIBC BACKTRACE ---

Can anyone help me figure out what the problem is? Am I missing something? Does 
my system configuration match the command I run? I would appreciate any help!

Thanks,
George

_______________________________________________
gem5-users mailing list
[email protected]<mailto:[email protected]>
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

