I think there are two situations where this assertion can occur.

1. If your functional model has a cache, then it is possible that you initiated a switchOut command at a point where your func cpu (or one of your func cpus, if you are running multi-proc) had issued a load/store and missed in the data cache. SimpleCPU changes the state of the cpu to DcacheMissStall and sets the memReq's completionEvent to a new event (CacheCompletionEvent) that should handle this properly.
Here's where I think things go wrong...
If you look in SimpleCPU::read(), you'll see that memReq->completionEvent is not set to &cacheCompletionEvent (the event that properly handles the cpu status) until after dcacheInterface->access() is called. This depends on the cache's implementation of access(): if the implementation schedules memReq->completionEvent during that call, then your special event will never be called. On the other hand, if your cache has a queue of outstanding memReqs that get serviced some simulation tick later, then it's okay.

The potential second problem (I think) is at the beginning of the SimpleCPU::read() call, where if status() is DcacheMissStall or DcacheMissSwitch, it just performs the read from functional memory without checking the cache. I'm not sure why, but I would think that after doing this, the cpu's _status flag should be set to Running, or at least it shouldn't be DcacheMissStall or DcacheMissSwitch anymore. Anyway, after adding a line (_status = Running;) in that branch, it seems like I don't have this problem anymore.

2. If you are switching back and forth between func and detailed cpu models (this can't be done with the current release of M5, but it is do-able; I know because I have it doing this :), then it's possible that when you switched out of functional mode, the status of the CPU was DcacheMissStall and was changed to DcacheMissSwitch. When you later switch back from detailed to func, the func cpu will still have status DcacheMissSwitch, and the first tick() of the cpu will fire that assertion. If you fixed problem #1, this shouldn't be a problem anymore.
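To make the status handling above a bit more concrete, here is a toy Python model. It is only a sketch: `ToyCPU`, its methods, and the order of calls in `reproduces_assertion` are made up for illustration and are not the real SimpleCPU code. Only the status names, the assertion condition, and the one-line `_status = Running` change come from this thread.

```python
from enum import Enum, auto


class Status(Enum):
    Running = auto()
    Idle = auto()
    DcacheMissStall = auto()
    DcacheMissSwitch = auto()


class ToyCPU:
    """Hypothetical stand-in for the functional cpu; not M5 code."""

    def __init__(self, apply_fix):
        self.status = Status.Running
        self.apply_fix = apply_fix

    def miss(self):
        # A load/store missed in the data cache.
        self.status = Status.DcacheMissStall

    def switch_out(self):
        # Switching out while stalled on a miss changes the status.
        if self.status == Status.DcacheMissStall:
            self.status = Status.DcacheMissSwitch

    def read(self):
        # In either miss state the read is served from functional memory.
        if self.status in (Status.DcacheMissStall, Status.DcacheMissSwitch):
            if self.apply_fix:
                # The one-line change suggested in situation 1.
                self.status = Status.Running
            return "functional read"
        return "cache access"

    def tick(self):
        # Same condition as the assertion in the original report.
        assert self.status in (
            Status.Running, Status.Idle, Status.DcacheMissStall)


def reproduces_assertion(apply_fix):
    """Assumed call order: miss, switch out, read, then the next tick."""
    cpu = ToyCPU(apply_fix)
    cpu.miss()
    cpu.switch_out()
    cpu.read()
    try:
        cpu.tick()
    except AssertionError:
        return True
    return False
```

In this toy, `reproduces_assertion(False)` trips the assertion because the status is still DcacheMissSwitch at the next tick, while `reproduces_assertion(True)` does not, since the read leaves the cpu in Running.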

hope this helps a little ...

Lisa Hsu wrote:
Speaking of misleading, root.pc_sample_interval is a totally orthogonal thing. It is the interval at which the PC is sampled to determine which function you are in, so that we can tell how much time is spent in each function. It's still sampling, but a totally different kind of sampling.

The reason we named the cpu Sampler is that we hope to eventually "sample" statistics in different periods of an application (e.g. every 1e9 cycles, take 200e6 cycles of statistics).

And since the default frequency of the system is 2GHz, then yes, 2,000,000 cycles is 1ms. You can change the frequency of the processor with the FREQUENCY environment variable.
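The conversion is just cycles divided by clock frequency. A quick sketch of the arithmetic (the helper name `cycles_to_seconds` is made up here and is not part of M5; the 2GHz default and the cycle counts are the ones mentioned in this thread):

```python
def cycles_to_seconds(cycles, frequency_hz=2e9):
    """Convert a cycle count to simulated seconds at the given clock rate."""
    return cycles / frequency_hz


# 2,000,000 cycles at the default 2GHz clock, in milliseconds.
two_million_ms = cycles_to_seconds(2_000_000) * 1e3

# The sampler defaults mentioned in this thread, in simulated seconds.
warmup_s = cycles_to_seconds(1e9)      # cache-warming period
detailed_s = cycles_to_seconds(200e6)  # detailed measurement period
```

At 2GHz this works out to about 1ms for 2,000,000 cycles, 0.5s for the 1e9-cycle warm-up, and 0.1s for the 200e6-cycle detailed run. Pass a different `frequency_hz` if you have changed the clock.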

How long a sim takes depends a lot on whether you are running simple CPU or detailed CPU, full system or not, and what kind of machine you are on. But running 1e9 cycles in CacheCPU followed by 200e6 on DetailedCPU for a network benchmark on a 3+GHz P4 takes around 3 hours or so, I believe. I don't have more specific numbers on how long detailed simulation takes, but it is pretty slow.

Lisa

On 4/4/06, Edmond Coté <[EMAIL PROTECTED]> wrote:

    Thanks,

    I think I've got a handle on how SAMPLER works, even though the name
    itself is misleading.

    A potential follow up question would be, what exactly does
    root.pc_sample_interval do?

    Also, what kind of simulation times can be expected for the detailed
    CPU model? I'm looking at ~30s for 2,000,000 ticks; does that make
    sense (Athlon X2 4800+)? My guess is that this represents
    approximately 1ms of actual time at 2GHz. Is this assumption valid?

    Edmond

    Lisa Hsu wrote:
    > Hi Edmond,
    >
    > I'm not sure about the m5 switchcpu command - I've never used it
    > before, hopefully someone else knows about that.  But the periods
    > that you mention represent how long, in cycles I believe, to run on
    > one cpu, and then how long to go on the next before exiting.  We
    > currently do not have the capability to switch back and forth, we
    > can only go through the sequence of cpus one time.
    >
    > If you want to change these defaults, just change the environment
    > variables WARMUP_PERIOD and RUN_PERIOD, the same way you set SYSTEM
    > and MEMORY.
    >
    > Finally, the CacheCPU is a construct that only exists in the python
    > configuration code (that is why Kevin didn't know what it was), and
    > it's just a simple CPU attached to a blocking cache hierarchy.
    >
    > The default of our sampler is to start at a cache CPU (to warm the
    > caches) for 1e9 cycles and go to a detailed CPU (to take data) for
    > 200e6 cycles.  We also often start from a checkpoint, which we
    > generated using only a simpleCPU so that we can skip a lot of the
    > stuff that happens in the beginning of a program.
    >
    > good luck.
    > lisa
    >
    >
    > On 3/31/06, Edmond Coté <[EMAIL PROTECTED]> wrote:
    >
    >     Hello,
    >
    >     I'm currently attempting to perform a full-system simulation
    >     using M5's Sampler feature. I receive the following output
    >     after issuing "m5 switchcpu" at the command prompt:
    >
    >     m5.opt: m5/cpu/simple/cpu.cc:806: void SimpleCPU::tick(): Assertion
    >     `status() == Running || status() == Idle || status() == DcacheMissStall'
    >     failed.
    >     Program aborted at cycle 3967324804
    >
    >     The simulator is launched within m5/configs/fullsys using:
    >
    >     SYSTEM=Sampler MEMORY=STX NUMCPUS=4 [...] m5.opt run.py
    >
    >     Next, I'm not entirely clear about how the period parameter
    >     (periods = [1e10, 200e6]) works. Am I correct to assume that the
    >     first value (in time/ticks/??) represents the delay until the
    >     first switch to detailed mode takes place? If so, how can I
    >     obtain a more accurate value? For example, when set at the
    >     default value, 1e9, the simulator essentially stalls before the
    >     CLI appears.
    >
    >     Finally, can anyone comment on the differences between CacheCPU
    >     and DetailedCPU?
    >
    >     Thanks, your help is much appreciated.
    >
    >     Edmond
    >
    >     _______________________________________________
    >     m5sim-users mailing list
    >     [email protected]
    >     https://lists.sourceforge.net/lists/listinfo/m5sim-users
    >
    >



