It turns out the problem is actually in the o3 cpu. When the fetch stage takes over from another cpu, it initializes its status variable to Running. When an o3 cpu is about to switch out to another cpu, the fetch stage checks that its status is Idle. Now suppose there are two processors in the system, the operating system has just started, and it is running on cpu0, so cpu1 is not actually doing anything. When cpu1 then tries to switch to another cpu, it gets stuck, because nothing is going on that would move its fetch stage from Running to Idle. Making the following change allows the simulation to proceed smoothly.

diff -r 8873fd449657 -r 02285e0961a1 src/cpu/o3/fetch_impl.hh
--- a/src/cpu/o3/fetch_impl.hh  Wed Jun 05 22:34:02 2013 -0500
+++ b/src/cpu/o3/fetch_impl.hh  Wed Jun 05 23:32:09 2013 -0500
@@ -317,7 +317,8 @@

     // Setup PC and nextPC with initial state.
     for (ThreadID tid = 0; tid < numThreads; tid++) {
-        fetchStatus[tid] = Running;
+        //fetchStatus[tid] = Running;
+        fetchStatus[tid] = Idle;
         pc[tid] = cpu->pcState(tid);
         fetchOffset[tid] = 0;
         macroop[tid] = NULL;
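
To illustrate why cpu1 gets stuck, here is a minimal, self-contained sketch of the state machine described above. It is not gem5 code: ToyFetch, pendingWork, takeOverFrom and reachesIdle are hypothetical names, and the rule that only finishing real work moves the stage from Running to Idle is a simplifying assumption.

```cpp
#include <cassert>

// Hypothetical, simplified stand-in for the fetch stage status; not gem5 code.
enum class FetchStatus { Idle, Running };

struct ToyFetch {
    FetchStatus status;
    int pendingWork;  // cycles of work left; 0 for a cpu that never ran anything

    // Mirrors the takeover step: 'initial' is the value assigned there.
    void takeOverFrom(FetchStatus initial) { status = initial; }

    // One simulated cycle.
    // Assumption: only finishing real work moves the stage from Running to Idle.
    void tick() {
        if (status == FetchStatus::Running && pendingWork > 0 &&
            --pendingWork == 0) {
            status = FetchStatus::Idle;
        }
    }
};

// Returns true if the stage is Idle within maxCycles, i.e. a switch-out that
// waits for Idle could complete. Works on a copy of the stage.
bool reachesIdle(ToyFetch fetch, int maxCycles) {
    assert(maxCycles >= 0);
    for (int i = 0; i < maxCycles && fetch.status != FetchStatus::Idle; ++i) {
        fetch.tick();
    }
    return fetch.status == FetchStatus::Idle;
}
```

In this toy model, a cpu with no work that is initialized to Running never reaches Idle, while the same cpu initialized to Idle can be switched out immediately. A cpu that does have work reaches Idle either way.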

--
Nilay

On Thu, 6 Jun 2013, Andreas Hansson wrote:

Hi Nilay,

I'll have a look at the DRAM controller. As you mention, the switching
regressions do not seem to have any issues. Did you get any further in
isolating the problem?

Thanks,

Andreas

On 06/06/2013 05:01, "Nilay Vaish" <[email protected]> wrote:

No, this is not the bug we are looking for. I just figured out how the
drain functionality works.

--
Nilay

On Wed, 5 Jun 2013, Joel Hestness wrote:

Hey Nilay,
 I haven't experienced this bug, but it may be the same one that Mahshid
(cc'd: [email protected]) is experiencing with her trouble restoring
a checkpoint with a large-scale CMP (in this thread:
http://www.mail-archive.com/[email protected]/msg07639.html).  If she
happens to still have that checkpoint, she might be able to test whether
her system sees the same draining bug.

 Joel


On Wed, Jun 5, 2013 at 5:51 PM, Nilay Vaish <[email protected]> wrote:

I am trying to debug an issue with switching cpus in a multiprocessor
system. While trying to drain the system, there is a call to simulate()
in function _drain() that seemingly gets stuck. To me it seems that it
should return when the main event queue becomes empty. But simple DRAM
keeps on posting its refresh event. It seems that once the DRAM has moved
to the drained / draining state, it should not post the refresh event.
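
To make the suspected failure mode concrete, here is a toy model of it. It is not gem5 code and does not use gem5's API: ToySim, ToyEvent, scheduleRefresh and the maxEvents cap are hypothetical, and the idea that the drain loop returns only once the queue is empty is the assumption stated above, not a confirmed description of simulate().

```cpp
#include <cstdint>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

// Hypothetical toy event; names are illustrative, not gem5's API.
struct ToyEvent {
    std::uint64_t when;
    std::function<void()> action;
};

// Orders the priority queue so the earliest event is on top.
struct Later {
    bool operator()(const ToyEvent& a, const ToyEvent& b) const {
        return a.when > b.when;
    }
};

struct ToySim {
    std::priority_queue<ToyEvent, std::vector<ToyEvent>, Later> queue;
    std::uint64_t now = 0;
    bool draining = false;

    void schedule(std::uint64_t when, std::function<void()> action) {
        queue.push(ToyEvent{when, std::move(action)});
    }

    // Periodic refresh: reschedules itself every 'period' ticks, unless
    // stopWhenDraining is set and the simulation is draining.
    void scheduleRefresh(std::uint64_t period, bool stopWhenDraining) {
        schedule(now + period, [this, period, stopWhenDraining] {
            if (stopWhenDraining && draining) return;
            scheduleRefresh(period, stopWhenDraining);
        });
    }

    // Processes events until the queue is empty or maxEvents were handled.
    // Returns true only if the queue actually emptied.
    bool simulate(int maxEvents) {
        for (int i = 0; i < maxEvents && !queue.empty(); ++i) {
            ToyEvent ev = queue.top();
            queue.pop();
            now = ev.when;
            ev.action();
        }
        return queue.empty();
    }
};
```

With stopWhenDraining set to false, the refresh event keeps the queue non-empty forever, so a loop waiting for an empty queue never finishes. With it set to true and draining raised, the queue empties after the next refresh fires.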

Can someone confirm that this is a bug in Simple DRAM? Given that there
are switch cpu regressions that we run every week, it is hard to believe
that Simple DRAM has a bug in its drain functionality. But I have not been
able to come up with an alternate explanation either.

Thanks
Nilay
_______________________________________________
gem5-dev mailing list
[email protected]
http://m5sim.org/mailman/listinfo/gem5-dev




--
 Joel Hestness
 PhD Student, Computer Architecture
 Dept. of Computer Science, University of Wisconsin - Madison
 http://pages.cs.wisc.edu/~hestness/