[Simflex] Re: Debugging message

lu peng Mon Nov 7 22:43:59 2005

From shanlu at cs.uiuc.edu  Tue Nov  8 00:46:55 2005
From: shanlu at cs.uiuc.edu (shan)
List-Post: [email protected]
Date: Tue Nov  8 00:47:40 2005
Subject: [Simflex] Re: Debugging message
In-Reply-To: <[email protected]>
Message-ID: <[email protected]>

Maybe I can help answer some of the questions :-), after having bothered
Tom and this mailing list so much recently.

Q2: LOAD/STORE requests are issued by the CPU to the L1 cache; Read/Write
requests are issued by a cache to the next lower-level cache. Fetch is an
instruction fetch.

Q3: I added some DBG statements to the function real_hier_operate in
SimicsTracer.hpp, under the "Case 3" branch, if (is_fetch), so that every
fetched instruction is shown. However, I guess Tom has a better solution.

Q4: I think the L2 is shared. MOSI-based directory state is kept in each L2
cache line: the line records its current state, which node is the owner, and
which nodes are sharers.
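
As a rough illustration of such a per-line directory entry, here is a minimal
sketch. It is not the Flexus data structure: the type names, the node count,
and the two update helpers are all hypothetical, and the state transitions
are simplified MOSI bookkeeping only.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

// Hypothetical names throughout; none of this is taken from the Flexus sources.
constexpr std::size_t kMaxNodes = 16;  // assumed number of L1 caches on the chip

enum class LineState { Modified, Owned, Shared, Invalid };

struct DirectoryEntry {
  LineState state = LineState::Invalid;
  int owner = -1;                    // node id of the owner, -1 if there is none
  std::bitset<kMaxNodes> sharers;    // one bit per node that holds a copy
};

// A node reads the line: it joins the sharer set. If another node held the
// line Modified, that node keeps the dirty data and the line becomes Owned.
inline void recordReader(DirectoryEntry& e, std::size_t node) {
  assert(node < kMaxNodes);
  e.sharers.set(node);
  if (e.state == LineState::Invalid) {
    e.state = LineState::Shared;
  } else if (e.state == LineState::Modified) {
    e.state = LineState::Owned;
  }
}

// A node writes the line: it becomes the only holder and the owner.
inline void recordWriter(DirectoryEntry& e, std::size_t node) {
  assert(node < kMaxNodes);
  e.sharers.reset();
  e.sharers.set(node);
  e.owner = static_cast<int>(node);
  e.state = LineState::Modified;
}
```

For example, a write by node 2 followed by a read by node 5 leaves the entry
in the Owned state with node 2 as owner and both nodes in the sharer set.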

That's all that I know. :-)

  _____  

From: [email protected] [mailto:[email protected]] On
Behalf Of lu peng
Sent: Monday, November 07, 2005 9:42 PM
To: [email protected]
Cc: [email protected]
Subject: [Simflex] Re: Debugging message

Hi Tom,

Thanks for your reply. I had forgotten to copy the generated .so file to the
Simics directory. Now my debug messages show up.

I am trying to understand the cache structure of Simflex. I read the ISCA '00
paper on the Piranha protocol. Several questions:

1. The paper says Piranha does not maintain the inclusion property. Does
Simflex maintain it in the CMPFlex model?

2. I noticed that there are many types of requests in the debug info:
MemoryMessage[Write Request], MemoryMessage[Read Request],
MemoryMessage[Fetch Request], MemoryMessage[Load Request],
MemoryMessage[Store Request], etc. Are they just synonyms for load/store,
or do they have specific meanings? It seems a Fetch request is an
instruction cache read.

3. I tried to trace some addresses. However, sometimes I can't find which CPU
generated a request in the first place. The first appearance I can find is
from, e.g., (CacheController.hpp: 1045){290}- sendBack_Request. Why? Did the
CPU issue a virtual address before putting the request into the request
queue, and did it come back with a physical address? Does {290} mean the
bus cycle number?

4. Does CMPFlex implement a shared L2 cache? If so, I am trying to
implement a NUCA cache based on it. Do I need to modify anything related to
the Piranha protocol? It seems to maintain coherence only for the L1 caches.
My plan is to find the L2 read/write source code and have every L2 read/write
operation search a directory that maps the physical address of a cache
block to its real position, then read/write the mapped bank and get the
latency. Is this enough? It seems unnecessary to modify the source
code in PiranhaCacheControllerImpl::performOperation(). Am I correct?

5. Does Simflex support a pure private L2 cache scheme?

6. If you have more documentation describing the source code, especially the
cache structure, please let me read it.

Thanks a lot,

Lu

From twenisch at ece.cmu.edu  Wed Nov  9 18:31:56 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Wed Nov  9 18:31:32 2005
Subject: [SimFlex]x86 tracer init bug? and other problems
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <pine.lnx.4.53l-ece.cmu.edu.0511091536030.22...@dalmore.ece.cmu.edu>

Hi Shan,

Sorry for the slow reply - I have been busy preparing the Flexus 2.0
release (which should be out later today).

On Mon, 7 Nov 2005, shan wrote:

> Hi Tom,
>
>   I have several problems.
>
> Second, however, I still have that 'watchdog' assertion violation problem.
> If I disable that checking, my program can proceed and looks ok (till some
> point). I have checked the trace and I feel a little bit confused. My
> program, at that time, has 2 threads. Definitely, there would be 2 idle
> processors and they would definitely stall longer than the watchdog
> threshold. Is my understanding somehow incorrect?

We talked about this issue here.  You are right that given the way the
simulator currently works, you will always end up with watchdog timeouts
if one of your CPUs is idle.

I believe the best fix is to place a cap on the maximum number of stall
cycles Flexus will ever return to Simics.  The cap should be larger than
any real stall the memory system might produce.  This fix is fairly
accurate from the modeling point of view, and is a general solution to the
unbounded stall issue.

An alternative fix would be to create a special case for the HLT
instruction, and not accumulate stalls for HLTs.  Although this fix may be
more accurate, it is less general, as there may be other special cases
where Flexus can accumulate an unbounded number of stalls, and these cases
will invariably result in deadlocks.

To make the fix, add a test on the number of stall cycles in
SimicsTracer.hpp:trace_mem_hier_operate().  5000 cycles is probably a
decent cap - I can't think of any way the memory system could
legitimately stall an access for more than 5000 cycles.
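
A minimal sketch of such a cap is shown below. The constant and helper names
are hypothetical (they are not from SimicsTracer.hpp), and the 5000-cycle
default is simply the value suggested above; the real change would apply the
same clamp to whatever stall count the tracer hands back to Simics.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical constant and helper; the names are not from the Flexus sources.
constexpr std::uint64_t kMaxStallCycles = 5000;

// Clamp the stall count so that no single access can stall a CPU for longer
// than the cap, however long the simulated memory system says it took.
inline std::uint64_t capStallCycles(std::uint64_t requested,
                                    std::uint64_t cap = kMaxStallCycles) {
  assert(cap > 0);  // a zero cap would hide every stall
  return std::min(requested, cap);
}
```

Ordinary stalls pass through unchanged (capStallCycles(120) is 120), while an
unbounded stall from an idle CPU is cut off at the cap.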

I am putting this fix into our codebase, so you will be able to pick it up
in the release later today.

>
>
>
> Third, every time I load the flexus-CMPFlexus-x86... module, it succeeds only
> about one third of the time; in the other two thirds, Simics gets a
> segmentation fault. I have checked the source file, and it seems there may be
> a bug in SimicsTracer.hpp, lines 118-125 (the init function). In the SPARC
> version, a 'thePhysIO' object is first read by
> SIM_get_attribute. However, in the x86 version, it seems that thePhysIO is used
> without initialization. I am not sure if my understanding is correct. I have
> tried moving everything related to thePhysIO into the SPARC-only code, so that
> the x86 version does not handle PhysIO at all. I am not sure if my
> modification is correct. After I did this, at least loading the
> flexus-CMPFlexus-x86 module always succeeds. But I don't know whether not
> handling PhysIO would cause other problems.
>

thePhysIO is not used for x86. I moved the rest of the code that uses it
into FLEXUS_TARGET_IS(v9) blocks, which should hopefully resolve the
seg-fault.

>
>
> Fourth, after commenting out the watchdog assertion, I tried my toy code. It
> runs for a while, but later it never terminates. (I did not make
> the modification described in the Third issue for these experiments.)
>
>    In one case, I find it has successfully gone through all the code, maybe
> up to right before the program's final 'return 0'. But it just stalls there,
> and the executor keeps issuing exactly the same memory request. The memory
> system replies to the request correctly, but the executor just keeps issuing
> the same request. (I attached the conf file and debug.out as
> debugnov7_joinedfinish.** for this.)
>
>    In another case, it stops at some earlier stage (I am not sure exactly
> where in the source code). This time, the executor keeps issuing a fixed set
> of memory requests in turn, just like an infinite loop. (I attached the conf
> file and debug.out as debugnov7_oldx86deadloop.** for this too.)
>

I am not sure about these infinite loops.  These could be the idle loop of
the OS after your application has finished.  I would take a look at the
PCs that Simics is feeding in during these loops, and see if they are OS
code.  You can probably modify some of the debug code in SimicsTracer to
print out the virtual PCs.

>
>
>   Tom, when you have time, can you help me look at these? Is my PhysIO
> modification correct? What's wrong with that infinite loop?
>
>   Sorry to trouble you again. :-)
>
> Thanks very much,
>
> shan
>
>
From twenisch at ece.cmu.edu  Wed Nov  9 18:53:06 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Wed Nov  9 18:52:39 2005
Subject: [Simflex] Re: Debugging message
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <pine.lnx.4.53l-ece.cmu.edu.0511091832110.4...@dalmore.ece.cmu.edu>

Hi Lu,

On Tue, 8 Nov 2005, lu peng wrote:

>
> Hi Tom,
>
> Thanks for your reply. I had forgotten to copy the generated .so file to the
> Simics directory. Now my debug messages show up.
>
> I am trying to understand the cache structure of Simflex. I read the ISCA '00
> paper on the Piranha protocol. Several questions:
>
> 1. The paper says Piranha does not maintain the inclusion property. Does
> Simflex maintain it in the CMPFlex model?

Just as in Piranha, the SimFlex CMP-cache does not maintain inclusion.
The uni-processor/DSM caches also do not maintain inclusion.

>
> 2. I noticed that there are many types of requests in the debug info:
> MemoryMessage[Write Request], MemoryMessage[Read Request],
> MemoryMessage[Fetch Request], MemoryMessage[Load Request],
> MemoryMessage[Store Request], etc. Are they just synonyms for load/store,
> or do they have specific meanings? It seems a Fetch request is an
> instruction cache read.

As Shan said, Load and Store are issued by the processor, Read and Write
are issued by a cache.  There is also an Upgrade - A cache issues an
Upgrade (instead of a Write) if it has valid data for a line, but does
not have write permission.  Fetch indicates an instruction as you guessed.
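
The distinction can be sketched as follows. This is a hypothetical model, not
the Flexus MemoryMessage code: the enum and struct names are made up, and
passing a Fetch down to the next level unchanged is an assumption.

```cpp
#include <cassert>

// Hypothetical types; they only mirror the request names seen in the debug output.
enum class ProcessorRequest { Load, Store, Fetch };        // issued by the CPU
enum class CacheRequest { Read, Write, Upgrade, Fetch };   // issued by a cache

struct LineStatus {
  bool hasValidData;        // the cache holds a valid copy of the line
  bool hasWritePermission;  // the cache is allowed to modify the line
};

// Pick the request a cache sends to the next level when it cannot satisfy
// the processor request locally.
inline CacheRequest requestToLowerLevel(ProcessorRequest req,
                                        const LineStatus& line) {
  switch (req) {
    case ProcessorRequest::Load:
      return CacheRequest::Read;
    case ProcessorRequest::Fetch:
      return CacheRequest::Fetch;  // assumption: forwarded as a Fetch
    case ProcessorRequest::Store:
      // A store that already has write permission is handled locally.
      assert(!line.hasWritePermission);
      // Valid data but no write permission: only permission is needed.
      return line.hasValidData ? CacheRequest::Upgrade : CacheRequest::Write;
  }
  return CacheRequest::Read;  // not reached; keeps compilers quiet
}
```

A Store to a line held read-only therefore turns into an Upgrade, while a
Store to a line that is not present turns into a Write.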

>
> 3. I tried to trace some addresses. However, sometimes I can't find which CPU
> generated a request in the first place. The first appearance I can find is
> from, e.g., (CacheController.hpp: 1045){290}- sendBack_Request. Why? Did the
> CPU issue a virtual address before putting the request into the request
> queue, and did it come back with a physical address? Does {290} mean the
> bus cycle number?

Some debug statements do not include Comp(*this), or an equivalent, which
causes the debug system to print the name/id of the component issuing the
request.  Debug statements that lack the component reference would need to
be fixed individually to print their node id.  All references within
the memory hierarchy use physical addresses.  Numbers in {} in debug
output indicate the cycle number since the start of simulation.
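
When filtering a long debug.out by time, it can help to pull the cycle number
out of each line. The helper below is a small sketch with a hypothetical
name; it assumes only that the cycle appears as decimal digits inside the
first pair of braces, as in the example line quoted above.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper: extract the number inside the first {...} of a debug
// line. Returns false, leaving `cycle` untouched, if there is no such number.
inline bool parseCycle(const std::string& line, std::uint64_t& cycle) {
  const std::string::size_type open = line.find('{');
  if (open == std::string::npos) return false;
  const std::string::size_type close = line.find('}', open + 1);
  if (close == std::string::npos || close == open + 1) return false;
  assert(close > open);  // find() started searching at open + 1
  std::uint64_t value = 0;
  for (std::string::size_type i = open + 1; i < close; ++i) {
    if (line[i] < '0' || line[i] > '9') return false;  // digits only
    value = value * 10 + static_cast<std::uint64_t>(line[i] - '0');
  }
  cycle = value;
  return true;
}
```

Applied to "(CacheController.hpp: 1045){290}- sendBack_Request", it yields
cycle 290; the 1045 in parentheses is the source line and is ignored.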

>
> 4. Does CMPFlex implement a shared L2 cache? If so, I am trying to
> implement a NUCA cache based on it. Do I need to modify anything related to
> the Piranha protocol? It seems to maintain coherence only for the L1 caches.
> My plan is to find the L2 read/write source code and have every L2 read/write
> operation search a directory that maps the physical address of a cache
> block to its real position, then read/write the mapped bank and get the
> latency. Is this enough? It seems unnecessary to modify the source
> code in PiranhaCacheControllerImpl::performOperation(). Am I correct?

CMPFlex uses a shared L2.  I don't know if you need to change the Piranha
protocol to implement NUCA.  You may be able to modify calcDelay() in
CacheController.hpp to change the latency of cache requests based on their
type and the location you assign to them in the NUCA.  This is probably
the simplest change that will give you a pretty good model of a NUCA
cache.
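
A latency function of that kind might look like the sketch below. Everything
in it is an illustrative assumption: it is not the calcDelay() signature, the
bank count and cycle costs are made-up numbers, and it uses a static
address-to-bank mapping with banks laid out in a line rather than a directory
lookup.

```cpp
#include <cassert>
#include <cstdint>

// All names and numbers here are illustrative assumptions, not Flexus values.
constexpr unsigned kNumBanks = 8;          // banks laid out in a single row
constexpr unsigned kBlockBits = 6;         // 64-byte cache blocks
constexpr unsigned kBankAccessCycles = 4;  // fixed cost of accessing any bank
constexpr unsigned kCyclesPerHop = 2;      // wire delay between adjacent banks

// Static NUCA mapping: the low-order block-address bits select the bank.
inline unsigned bankOf(std::uint64_t physAddr) {
  return static_cast<unsigned>((physAddr >> kBlockBits) % kNumBanks);
}

// Latency = fixed bank access time + wire delay proportional to the distance
// between the requester's position in the row and the bank holding the block.
inline unsigned nucaDelay(std::uint64_t physAddr, unsigned requesterPos) {
  assert(requesterPos < kNumBanks);
  const unsigned bank = bankOf(physAddr);
  const unsigned hops =
      bank > requesterPos ? bank - requesterPos : requesterPos - bank;
  return kBankAccessCycles + kCyclesPerHop * hops;
}
```

With these numbers, a requester at position 0 sees 4 cycles for a block in
bank 0 and 18 cycles for a block in bank 7. A dynamic NUCA would replace
bankOf() with a lookup in a block-location table.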

>
> 5. Does Simflex support a pure private L2 cache scheme?

UniFlex (and DSMFlex, which will be available in 2.0) use private L2
caches.

>
> 6. If you have more doc describing the source code, especially for the cache 
> structure, please let me read it.

The slides from the SimFlex tutorial (which we are giving in Barcelona at
MICRO on Saturday), will be available on the web later this evening.

>
> Thanks a lot,
>
> Lu
>

Regards,
-Tom Wenisch

From twenisch at ece.cmu.edu  Wed Nov  9 18:55:15 2005
From: twenisch at ece.cmu.edu (Thomas Wenisch)
List-Post: [email protected]
Date: Wed Nov  9 18:54:45 2005
Subject: [Simflex] Re: Debugging message
In-Reply-To: <[email protected]>
References: <[email protected]>
Message-ID: <pine.lnx.4.53l-ece.cmu.edu.0511091853570.4...@dalmore.ece.cmu.edu>

Hi Shan,

Thanks for replying to Lu's questions.  The more people we can get to
answer questions on the list, the faster the overall response time for
everyone will be.

Best Regards,
-Tom Wenisch

On Mon, 7 Nov 2005, shan wrote:

> Maybe I can help answer some of the questions :-), after having bothered
> Tom and this mailing list so much recently.
>
From shanlu at cs.uiuc.edu  Wed Nov  9 21:18:03 2005
From: shanlu at cs.uiuc.edu (shan)
List-Post: [email protected]
Date: Wed Nov  9 22:17:34 2005
Subject: [Simflex] will speculative memory accesses be simulated?
In-Reply-To: <pine.lnx.4.53l-ece.cmu.edu.0511091853570.4...@dalmore.ece.cmu.edu>
Message-ID: <[email protected]>

Hi Tom,
  I have a question about branch misprediction simulation.
  How is the branch misprediction time charged?
  I guess at each branch instruction we know whether it will be
mispredicted or not. If we know the prediction will be wrong, will the
simulation proceed along the wrong path and speculatively fetch data
from the cache?
  A related question: does Simics give Flexus speculatively executed
instructions, or does the 'feeder' only ever get correct-path instructions
from Simics?

Thanks
Shan
