Thanks Andreas, I will definitely look into those as well.
From: Andreas Hansson [mailto:[email protected]]
Sent: Wednesday, August 06, 2014 10:38 AM
To: Brown, Martin; Default
Subject: Re: Review Request 2320: sim: stopgap for race-conditions when using multiple EventQueues

Hi Martin,

Thanks for running the experiments. It sounds rather painful indeed.

Concerning the atomic_int, I've actually managed to shift most uses of the in-house RefCountingPtr to the (thread-safe) STL shared_ptr, and I've seen a performance drop of roughly 5% as a result of this change. I should be able to get the patches on the board in the not-too-distant future.

Another observation when using the shared_ptr is that compiling with -march=native seems to recover some of the loss. It might be worth checking in your case as well.

Andreas

From: <Brown>, Martin <[email protected]<mailto:[email protected]>>
Date: Wednesday, 6 August 2014 16:20
To: Andreas Hansson <[email protected]<mailto:[email protected]>>, Default <[email protected]<mailto:[email protected]>>
Subject: RE: Review Request 2320: sim: stopgap for race-conditions when using multiple EventQueues

Hi Andreas,

Great question. After I saw it, I ran some small tests using the Queens algorithm on a 12x12 board. There is a performance impact for the single-threaded case. Here is the slowdown that I observed:

- 50% slowdown with this entire patch applied, ouch.
- 12% slowdown with all the locks applied, except for the std::atomic_int

So the std::atomic_int is the biggest factor, but this should be fixed for all of the locks anyway: gem5 should use the locks in this patch only when running with multiple threads. I will look into this. Right now I'm considering using #ifdef for that. I am open to suggestions.

Thanks!
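A minimal sketch of the #ifdef idea mentioned above, for illustration only. The macro SKETCH_MULTI_THREADED and the class Counted are hypothetical names, not gem5 identifiers, and this is not the actual change to src/base/refcnt.hh in the patch.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical build flag (not a gem5 option): define it for builds
// that run multiple EventQueues on separate threads.
#ifdef SKETCH_MULTI_THREADED
using RefCount = std::atomic<int>;  // thread-safe increments and decrements
#else
using RefCount = int;               // plain counter, no atomic overhead
#endif

// Stand-in for a reference-counted base class.
class Counted
{
  public:
    // Take one more reference.
    void incref() { ++count; }

    // Drop one reference; returns true when the last one is gone.
    bool decref()
    {
        assert(count > 0);
        return --count == 0;
    }

    // Current number of references.
    int refs() const { return count; }

  private:
    RefCount count{0};
};
```

Built without the macro, the counter is a plain int and no atomic operations are issued; built with -DSKETCH_MULTI_THREADED, the same code uses std::atomic<int>. The trade-off is that thread safety becomes a build-time choice rather than a run-time one.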
From: Andreas Hansson [mailto:[email protected]] On Behalf Of Andreas Hansson
Sent: Monday, August 04, 2014 4:07 AM
To: Andreas Hansson; Default; Brown, Martin
Subject: Re: Review Request 2320: sim: stopgap for race-conditions when using multiple EventQueues

This is an automatically generated e-mail. To reply, visit: http://reviews.gem5.org/r/2320/

Out of curiosity, is there any performance impact for the single-threaded case?

- Andreas Hansson

On August 1st, 2014, 7:04 p.m. UTC, Martin Brown wrote:

Review request for Default.
By Martin Brown.
Updated Aug. 1, 2014, 7:04 p.m.

Repository: gem5

Description

Changeset 10264:c3977836244e
---------------------------
sim: stopgap for race-conditions when using multiple EventQueues

This patch fixes several race conditions that appear in multi-threaded mode. Currently the decode cache race condition is fixed only for x86, and in a temporary non-optimal fashion. We still need to decide on a more optimal solution for the decode cache and apply it to all the ISAs.

Testing

- Quick regression tests on x86, arm, alpha
- Made sure that sparc, power, mips can be built with this patch
- Tested using up to 28 EventQueues (28 threads)

Diffs

- src/arch/x86/decoder.cc (c00b5ba43967e7e48a28b7ddc48c9f4afaf2ab76)
- src/base/refcnt.hh (c00b5ba43967e7e48a28b7ddc48c9f4afaf2ab76)
- src/base/trace.cc (c00b5ba43967e7e48a28b7ddc48c9f4afaf2ab76)
- src/sim/syscall_emul.cc (c00b5ba43967e7e48a28b7ddc48c9f4afaf2ab76)

View Diff: http://reviews.gem5.org/r/2320/diff/
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev
