On Thu, Sep 8, 2011 at 12:21 PM, avadh patel <[email protected]> wrote:

>
> On Thu, Sep 8, 2011 at 9:06 AM, DRAM Ninjas <[email protected]> wrote:
>
>> I think the short answer is -- all the easy stuff has already been done. A
>> lot of the if statements have branch predictor hints, debug output is cut
>> down to a minimum when building without debug mode, etc.
>>
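
For readers unfamiliar with the branch predictor hints mentioned above, here is a minimal sketch of the usual GCC/Clang idiom. The macro names LIKELY/UNLIKELY and the count_hits function are hypothetical illustrations, not taken from the Marss source; only __builtin_expect itself is a real compiler builtin.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical macro names. __builtin_expect is a GCC/Clang builtin that
// tells the compiler which outcome of a condition is expected.
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   (__builtin_expect(!!(x), 1))
#define UNLIKELY(x) (__builtin_expect(!!(x), 0))
#else
#define LIKELY(x)   (x)
#define UNLIKELY(x) (x)
#endif

// Hypothetical hot loop: count cache hits, assuming hits are the common
// case so the compiler can lay out the hit path as the fall-through.
std::uint64_t count_hits(const std::vector<bool>& hit_flags) {
    std::uint64_t hits = 0;
    for (bool hit : hit_flags) {
        if (LIKELY(hit)) {
            ++hits;
        }
    }
    return hits;
}
```

The hint changes only code layout, never the result, so it pays off only on branches that really are heavily skewed one way.
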
> That's true. But much of the new code added to Marss has not been profiled
> to find bottlenecks, especially for multicore simulations.  In SCons I have
> set up compilation flags to use google-profile to profile simulations for
> memory usage and performance.  I did use it when we had an issue with high
> memory usage with checkpoints, but I still need to find some time to get a
> profile for performance bottlenecks. If someone can pick up this task,
> generate the profile output, and post it somewhere, then we can try to
> optimize these bottlenecks.
>

I've linked in the google perf tools and enabled the CPU profiler -- I'm
running some 8 core runs with the profiler and core-models ... if it
actually produces any useful information, I'll post the profiles somewhere.


>
> In the optimized binary we don't disable ASSERT statements. Now that Marss
> is more stable, maybe in the next release I'll disable ASSERT for better
> performance.
>
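
As a rough illustration of what disabling assertions at build time can look like, here is a minimal sketch. The names SIM_ASSERT, DISABLE_ASSERT, and checked_divide are hypothetical; the real Marss ASSERT macro and its build flags may be defined quite differently.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical flag: building with -DDISABLE_ASSERT compiles every check
// out, so the condition is never evaluated in the optimized binary.
#ifdef DISABLE_ASSERT
#define SIM_ASSERT(cond) ((void)0)
#else
#define SIM_ASSERT(cond)                                            \
    do {                                                            \
        if (!(cond)) {                                              \
            std::fprintf(stderr, "Assertion failed: %s (%s:%d)\n",  \
                         #cond, __FILE__, __LINE__);                \
            std::abort();                                           \
        }                                                           \
    } while (0)
#endif

// Hypothetical helper showing the macro guarding an invariant.
int checked_divide(int a, int b) {
    SIM_ASSERT(b != 0);
    return a / b;
}
```

Because a disabled check is not evaluated at all, conditions passed to such a macro must not have side effects the simulation depends on.
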
>> Any real speedup would come from trying to parallelize the code, but I
>> doubt that will happen any time soon (if ever) -- the simulator is just too
>> detailed and complex to be easily parallelizable.
>>
> :D Well, I have some very, very alpha-level code that uses pthreads to
> divide the cores into groups, each run on a specific thread, when you are
> using more than 8 cores.  There are some locking issues which slow down
> performance, but preliminary results show that, if done right, it will help
> us simulate a large number of cores at a decent speed.
>
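
The grouping idea described above can be sketched roughly as follows. This is not the alpha code mentioned in the thread: it uses std::thread instead of raw pthreads, and the Core struct and run_grouped function are hypothetical stand-ins. It also assumes the cores share no state, which sidesteps exactly the locking issues a real simulator with shared caches and interconnect would have to solve.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical stand-in for a simulated core with purely private state.
struct Core {
    std::uint64_t cycles = 0;
    void tick() { ++cycles; }
};

// Advance every core by `cycles`. Cores are split into contiguous groups
// of `group_size`, and each group is driven by its own worker thread.
void run_grouped(std::vector<Core>& cores, std::size_t group_size,
                 std::uint64_t cycles) {
    if (group_size == 0) {
        group_size = 1;  // avoid an infinite loop on a zero group size
    }
    std::vector<std::thread> workers;
    for (std::size_t begin = 0; begin < cores.size(); begin += group_size) {
        const std::size_t end = std::min(begin + group_size, cores.size());
        workers.emplace_back([&cores, begin, end, cycles] {
            for (std::uint64_t c = 0; c < cycles; ++c) {
                for (std::size_t i = begin; i < end; ++i) {
                    cores[i].tick();  // touches only this group's cores
                }
            }
        });
    }
    for (std::thread& w : workers) {
        w.join();
    }
}
```

For example, run_grouped(cores, 8, 1000) on a 16-core vector starts two workers of 8 cores each. In a real simulator the groups would need to synchronize wherever they touch shared structures, and the cost of that synchronization decides whether the threads help at all.
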
>
>> So I think as a community, we just have to accept the fact that detailed
>> simulation is slow.
>>
> I agree. But as a community we can develop new designs that enable us to
> take advantage of multicore systems, or maybe use some of these new
> languages like Go or D to build next-generation simulators. It's really
> ironic that our research is focused on multicore systems and we can't use
> those to improve our lives.
>
> - Avadh
>
> On Wed, Sep 7, 2011 at 4:54 PM, sparsh1 mittal1 <[email protected]> wrote:
>>
>>> Hello
>>> Does anyone have suggestions for speeding up Marss? I am sure this
>>> will help others as well.
>>>
>>> My friends who have used M5 told me that in M5 the simulation speed
>>> drops almost linearly with the number of cores. Given this, the speed of
>>> Marss with multiple cores is already impressive! Yet, further speed-ups
>>> will help.
>>>
>>> Some general ideas are reducing printouts and I/O. Yet, would you like to
>>> share more specific and substantial speed-up ideas? For example, my main
>>> interest is in cache-related work.
>>>
>>> I would appreciate it.
>>>
>>> Thanks and Regards
>>> Sparsh Mittal
>>>
>>>
>>>
>>> _______________________________________________
>>> http://www.marss86.org
>>> Marss86-Devel mailing list
>>> [email protected]
>>> https://www.cs.binghamton.edu/mailman/listinfo/marss86-devel
>>>
>>>
>>
>