Hi Jason,

Based on the slides from the 2006 ISCA tutorial, trace simulation runs at ~1.5 
MIPS, InOrder timing at 10 KIPS, and OoO timing at 3 KIPS (for whatever processor 
these measurements were taken on back then). I believe the bulk of the slowdown 
comes from the different modes Simics itself is running in, but I don't have any 
evidence to support this claim. Unfortunately, none of us here has ever used 
the InOrder simulator.
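
To put those rates in perspective, here is a rough back-of-the-envelope sketch. 
The three throughput figures are the ones quoted above; the helper names are 
made up for illustration and are not part of Flexus.

```cpp
#include <cassert>

// Simulation throughputs quoted from the 2006 ISCA tutorial slides,
// expressed in simulated instructions per wall-clock second.
constexpr double kTraceIps   = 1.5e6;  // ~1.5 MIPS
constexpr double kInOrderIps = 10e3;   // 10 KIPS
constexpr double kOoOIps     = 3e3;    // 3 KIPS

// How many times faster one simulation mode is than another.
// (Hypothetical helper, for illustration only.)
constexpr double speedup(double fasterIps, double slowerIps) {
    return fasterIps / slowerIps;
}

// Wall-clock hours needed to simulate `instructions` at a rate of `ips`.
// (Hypothetical helper, for illustration only.)
double hoursToSimulate(double instructions, double ips) {
    assert(ips > 0.0);
    return instructions / ips / 3600.0;
}
```

With these numbers, speedup(kInOrderIps, kOoOIps) is about 3.3 and 
speedup(kTraceIps, kInOrderIps) is 150. Assuming a run of one billion 
instructions (an arbitrary example length, not a measurement), 
hoursToSimulate gives roughly 93 hours for OoO timing, 28 hours for InOrder 
timing, and about 11 minutes for trace simulation.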

Regards,
Evangelos

On Mar 29, 2013, at 5:04 PM, Jason Zebchuk wrote:

> Hi Evangelos,
> 
> There are a couple of reasons, but mostly it's because we want to see if we can 
> improve the time it takes to explore ideas by using long-running timing 
> simulations instead of the sampling methodology.  At the moment, we tend to 
> spend a lot of time working in functional simulation trying to see if 
> something has potential, then if we want to measure the performance impact, 
> we have to generate flexpoints and run timing simulation.  We've consistently 
> been frustrated by the need to develop models in the functional simulator and 
> then port the same model to the timing simulator. In addition, the time 
> required to generate the flexpoints also becomes a bit of a bottleneck, 
> especially for the new cloudsuite workloads.
> 
> So we've been thinking of using the in-order core so that long-running timing 
> simulations would hopefully run fast enough that we could use them for early 
> exploration of the performance potential of different ideas. The thought here 
> is that the InOrder simulator would be significantly faster than 
> just putting the OoO simulator into in-order mode. Do you have a rough 
> estimate of the kind of speedup you experience between in-order and 
> out-of-order using the OoO simulator?
> 
> 
> Thanks,
> 
> Jason
> 
> 
> 
> On 2013-03-29 9:26 AM, Evangelos Vlachos wrote:
>> Hi Jason,
>> 
>> Is there a reason why you want to use the InOrder simulator? We discontinued 
>> it as of (at least) the last release.  Even when I started using Flexus (6-7 
>> years ago) the older students suggested that I use the OoO simulator 
>> and configure it to model an InOrder core, just because the OoO codebase was 
>> getting more attention. I believe we have been doing that ever since. 
>> 
>> Regards,
>> Evangelos
>> 
>> On Mar 29, 2013, at 1:49 PM, Jason Zebchuk wrote:
>> 
>>> We're using timing with an inorder core (InorderSimicsFeeder, Execute, 
>>> IFetch, and BPWarm instead of uArch, FetchAddressGenerate, and uFetch, 
>>> etc.).
>>> 
>>> In the first case, we set it to stop after the first cycle and it actually 
>>> ran for about 165 cycles or so until the first instruction for each core 
>>> completed. We're simulating 16 cores with a scientific benchmark and most 
>>> of the cores tried to fetch the same instruction on the first cycle 
>>> resulting in a lot of queuing. I tracked the behavior in this case and it 
>>> issued 1 instruction for each core and completed just after every 
>>> instruction would have finished.
>>> 
>>> In the second case, it was set to terminate after 15k cycles.  Looking at 
>>> timestamps, that took a couple of minutes. The next 5k cycles took about 2 
>>> hours and it still hadn't stopped executing. Because it's so slow, I 
>>> haven't tried to track down whether there are any memory requests that are 
>>> delayed this long in the hierarchy or whether there's some other reason why 
>>> it's still executing. From my experience, it's pretty rare for a memory 
>>> request to take that long, especially considering that the in-order core 
>>> should cause less contention than an out-of-order core.
>>> 
>>> We did some debugging with gdb and it's definitely saving the statistics 
>>> every cycle, which definitely creates a huge slowdown.
>>> 
>>> It looks like it's getting stuck in the loop in 
>>> nInorderSimicsFeeder::SimicsCycleManager::advanceCycles() in 
>>> components/InorderSimicsFeeder/CycleManager.hpp. I would expect that trying 
>>> to terminate the simulation should cause it to break out of this loop, but 
>>> it looks like that's not happening.
>>> 
>>> 
>>> Jason
>>> 
>>> 
>>> 
>>> On 2013-03-29 1:10 AM, Mahmood Naderan wrote:
>>>> Hi
>>>> 
>>>> >It tried to terminate after the first cycle, but it looks like it kept 
>>>> >executing for several cycles afterwards. It kept printing out the 
>>>> >following messages:
>>>> 
>>>> What is the end cycle? 1000?
>>>> 
>>>>  
>>>> >In one case, it executed 15k cycles very quickly, and then took a couple 
>>>> >of hours executing another 5k cycles and it still hadn't stopped the 
>>>> >simulation
>>>> 
>>>> Are you sure this behavior is the result of saving stats every cycle?
>>>> 
>>>> Are you using trace? Timing?
>>>> 
>>>> -- 
>>>> Regards, 
>>>> Mahmood
>>>> 
>>>> 
>>>> 
>>>> From: Jason Zebchuk <[email protected]>
>>>> To: "[email protected]" <[email protected]> 
>>>> Sent: Friday, March 29, 2013 5:11 AM
>>>> Subject: Inorder simulation not stopping gracefully
>>>> 
>>>>  Hi guys,
>>>> 
>>>> We tried running a simulation using the inorder core instead of the 
>>>> out-of-order core, and we ran into a little problem.
>>>> 
>>>> We did:
>>>> 
>>>> flexus.set "-magic-break:stop_cycle" "1"
>>>> 
>>>> to stop after a single cycle. It tried to terminate after the first cycle, 
>>>> but it looks like it kept executing for several cycles afterwards. It kept 
>>>> printing out the following messages:
>>>> 
>>>> <breakpoint_tracker.cpp:447> {1}- Reached target cycle. Ending simulation.
>>>> <flexus.cpp:717> {1}- Terminating simulation. Timestamp: 2013-Mar-28 
>>>> 20:02:51
>>>> <flexus.cpp:718> {1}- Saving final stats_db.
>>>> 
>>>> This was repeated over and over (with the cycle number incrementing by one 
>>>> each time) until the simulation eventually stopped.
>>>> 
>>>> It looks like it's waiting for outstanding memory requests to terminate 
>>>> before exiting the simulation. Is this the normal behavior with the 
>>>> in-order core?
>>>> 
>>>> The real problem is that each cycle it tries to save the statistics.  When 
>>>> we try running longer simulations, the statistics get rather large so it 
>>>> advances very slowly. We also saw cases where it would continue running 
>>>> for several hours after it should have terminated. In one case, it 
>>>> executed 15k cycles very quickly, and then took a couple of hours 
>>>> executing another 5k cycles and it still hadn't stopped the simulation.  
>>>> I'm not sure if this is an issue with the memory hierarchy taking a long 
>>>> time to complete all of the outstanding requests, or if there's some other 
>>>> bug in this case.
>>>> 
>>>> Any thoughts you might have would be useful.
>>>> 
>>>> 
>>>> Thanks,
>>>> 
>>>> Jason
>>> 
>> 
> 
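
For reference, the behaviour expected in the original report (leave the 
cycle-advance loop once termination is requested, and write the stats only 
once) can be sketched as below. This is a hypothetical illustration only: the 
struct and all of its members are invented for the example and are not taken 
from Flexus or from CycleManager.hpp.

```cpp
#include <cassert>

// Hypothetical stand-in for a simulator's cycle loop. None of these names
// come from Flexus; they only illustrate the expected control flow.
struct CycleLoopSketch {
    long cycle = 0;            // cycles executed so far
    long stopCycle = 0;        // target cycle, analogous to stop_cycle
    bool terminating = false;  // set once termination has been requested
    int statsSaves = 0;        // how many times the stats were written

    // Placeholder for writing the (potentially large) stats database.
    void saveStats() { ++statsSaves; }

    // Idempotent: repeated calls after the first do nothing, so the
    // stats are written exactly once.
    void requestTermination() {
        if (terminating) return;
        terminating = true;
        saveStats();
    }

    // Advance up to `n` cycles, leaving the loop as soon as termination
    // has been requested. Returns the number of cycles actually run.
    long advanceCycles(long n) {
        assert(n >= 0);
        long ran = 0;
        while (ran < n && !terminating) {
            ++cycle;
            ++ran;
            if (cycle >= stopCycle) requestTermination();
        }
        return ran;
    }
};
```

In this sketch, setting stopCycle to 1 and calling advanceCycles(165) runs a 
single cycle and saves the stats once; any later call returns immediately 
without saving again. Whether the real simulator must keep cycling to drain 
outstanding memory requests is a separate question that the sketch does not 
model.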
