On Thu, Sep 8, 2011 at 12:21 PM, avadh patel <[email protected]> wrote:

> On Thu, Sep 8, 2011 at 9:06 AM, DRAM Ninjas <[email protected]> wrote:
>
>> I think the short answer is -- all the easy stuff has already been done. A
>> lot of the if statements have branch predictor hints, debug output is cut
>> down to a minimum when building without debug mode, etc.
>
> That's true. But much of the new code added to Marss has not been profiled
> to find bottlenecks, especially for multicore simulations. In SCons I have
> set up compilation flags to use google-profile to profile simulations for
> memory usage and performance. I did use it when we had an issue with high
> memory usage with checkpoints, but I still need to find some time to get a
> profile for performance bottlenecks. If someone can pick up this task,
> generate the profile output and post it somewhere, then we can try to
> optimize these bottlenecks.

I've linked in the google perf tools and enabled the CPU profiler -- I'm
running some 8 core runs with the profiler and core-models ... if it
actually produces any useful information, I'll post the profiles somewhere.

> In the optimized binary we don't disable ASSERT statements. Now that Marss
> is more stable, maybe in the next release I'll disable ASSERT for better
> performance.
>
>> Any real speedup would come from trying to parallelize the code, but I
>> doubt that will happen any time soon (if ever) -- the simulator is just too
>> detailed and complex to be easily parallelizable.
>
> :D . Well, I have some very, very alpha-level code that uses pthreads to
> divide the cores into groups, each running on a specific thread, when you
> are simulating more than 8 cores. There are some locking issues that slow
> it down, but preliminary results show that, if done right, it will help us
> simulate a large number of cores at a decent speed.
>
>> So I think as a community, we just have to accept the fact that detailed
>> simulation is slow.
>
> I agree. But as a community we can develop new designs that enable us to
> take advantage of multicore systems, or maybe use some of these new
> languages like Go or D to build next-generation simulators. It's really
> ironic that our research is focused on multicore systems and we can't use
> those to improve our life.
>
> - Avadh
>
> On Wed, Sep 7, 2011 at 4:54 PM, sparsh1 mittal1 <[email protected]> wrote:
>>
>>> Hello
>>> Does anyone have suggestions for speeding up Marss? I am sure this
>>> will help others too.
>>>
>>> My friends who have used M5 told me that in M5 the simulation speed
>>> drops almost linearly with the number of cores. Given this, the speed of
>>> Marss with multiple cores is already impressive! Yet, further speed-ups
>>> will help.
>>>
>>> Some general ideas are reducing print-outs and I/O. Yet, would you like
>>> to share more specific and substantial speed-up ideas? For example, my
>>> main interest is in cache-related work.
>>>
>>> I would appreciate it.
>>>
>>> Thanks and Regards
>>> Sparsh Mittal

_______________________________________________
http://www.marss86.org
Marss86-Devel mailing list
[email protected]
https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
