Hi Nilay,
Overall, I believe we are in more agreement than you may think. I'm glad
you included pseudocode in your latest email. That is a great idea. I
think part of our problem is that we are interpreting each other's
textual descriptions in different ways.
Below are my responses:
Absolutely, we still need the ability to allocate or deallocate
entries within actions. I'm not advocating completely eliminating
the set/unset cache and tbe entry functions. I just want to avoid
including those calls in the inports. I'm confused about why the mandatory
queue is different from other queues. They all trigger events in the same
way.
if (L1IcacheMemory.isTagPresent(in_msg.LineAddress)) {
  // The tag matches for the L1, so the L1 asks the L2 for it.
  trigger(mandatory_request_type_to_event(in_msg.Type),
          in_msg.LineAddress);
}
Brad, the mandatory queue is just an example where an inport may perform a
tag lookup before the cache and transaction buffer entries have been set.
Above is an excerpt from the file MOESI_CMP_directory-L1cache.sm. Before
trigger() is called, isTagPresent() is called. This means a tag lookup is
performed before the cache or transaction buffer entries have been set.
Suppose the tag was present in the L1Icache; then, in the trigger() call,
we will perform the lookup again.
Similarly, there is an inport in the Hammer protocol implementation where
getCacheEntry() is called before a call to trigger(). Now, why should we use
getCacheEntry() in the inport and the cache entry variable in the action?
The reason is that, as you pointed out, we ideally want to call
getCacheEntry() once. I believe your suggestion to use local variables in
the input ports gets us there. Below is what I'm envisioning for the
MOESI_hammer mandatory queue in_port logic (at least the IFETCH half of
the logic):
ENTRY getL1ICacheEntry(Address addr) {
  assert(is_valid(L1DcacheMemory.lookup(addr)) == FALSE);
  assert(is_valid(L2cacheMemory.lookup(addr)) == FALSE);
  return L1IcacheMemory.lookup(addr);
}

ENTRY getL1DCacheEntry(Address addr) {
  assert(is_valid(L1IcacheMemory.lookup(addr)) == FALSE);
  assert(is_valid(L2cacheMemory.lookup(addr)) == FALSE);
  return L1DcacheMemory.lookup(addr);
}

ENTRY getL2CacheEntry(Address addr) {
  assert(is_valid(L1IcacheMemory.lookup(addr)) == FALSE);
  assert(is_valid(L1DcacheMemory.lookup(addr)) == FALSE);
  return L2cacheMemory.lookup(addr);
}
in_port(mandatoryQueue_in, CacheMsg, mandatoryQueue, desc="...", rank=0) {
  if (mandatoryQueue_in.isReady()) {
    peek(mandatoryQueue_in, CacheMsg, block_on="LineAddress") {
      // Set the local entry variables
      ENTRY L1I_cache_entry = getL1ICacheEntry(in_msg.LineAddress);
      ENTRY L1D_cache_entry = getL1DCacheEntry(in_msg.LineAddress);
      TBE_Entry tbe_entry = getTBE(in_msg.LineAddress);

      // Check for data accesses to blocks in the I-cache and ifetches
      // to blocks in the D-cache
      if (in_msg.Type == CacheRequestType:IFETCH) {
        // *** INSTRUCTION ACCESS ***
        // Check to see if it is in the OTHER L1
        if (is_valid(L1D_cache_entry)) {
          // The block is in the wrong L1, try to write it to the L2
          if (L2cacheMemory.cacheAvail(in_msg.LineAddress)) {
            trigger(Event:L1_to_L2, in_msg.LineAddress, L1D_cache_entry,
                    tbe_entry);
          } else {
            replace_addr = L2cacheMemory.cacheProbe(in_msg.LineAddress);
            replace_cache_entry = getL2CacheEntry(replace_addr);
            replace_tbe_entry = getTBE(replace_addr);
            trigger(Event:L2_Replacement, replace_addr, replace_cache_entry,
                    replace_tbe_entry);
          }
        }

        if (is_valid(L1I_cache_entry)) {
          // The tag matches for the L1, so the L1 fetches the line. We
          // know it can't be in the L2 due to exclusion.
          trigger(mandatory_request_type_to_event(in_msg.Type),
                  in_msg.LineAddress, L1I_cache_entry, tbe_entry);
        } else {
          if (L1IcacheMemory.cacheAvail(in_msg.LineAddress)) {
            // L1 doesn't have the line, but we have space for it in the L1
            ENTRY L2_cache_entry = getL2CacheEntry(in_msg.LineAddress);
            if (is_valid(L2_cache_entry)) {
              // L2 has it (maybe not with the right permissions)
              trigger(Event:Trigger_L2_to_L1I, in_msg.LineAddress,
                      L2_cache_entry, tbe_entry);
            } else {
              // We have room, the L2 doesn't have it, so the L1 fetches
              // the line
              trigger(mandatory_request_type_to_event(in_msg.Type),
                      in_msg.LineAddress, L1I_cache_entry, tbe_entry);
              // You could also say here:
              // trigger(mandatory_request_type_to_event(in_msg.Type),
              //         in_msg.LineAddress, OOD, OOD);
            }
          } else {
            // No room in the L1, so we need to make room
            if (L2cacheMemory.cacheAvail(
                    L1IcacheMemory.cacheProbe(in_msg.LineAddress))) {
              // The L2 has room, so we move the line from the L1 to the L2
              replace_addr = L1IcacheMemory.cacheProbe(in_msg.LineAddress);
              replace_cache_entry = getL1ICacheEntry(replace_addr);
              replace_tbe_entry = getTBE(replace_addr);
              trigger(Event:L1_to_L2, replace_addr, replace_cache_entry,
                      replace_tbe_entry);
            } else {
              // The L2 does not have room, so we replace a line from the L2
              replace_addr = L2cacheMemory.cacheProbe(
                  L1IcacheMemory.cacheProbe(in_msg.LineAddress));
              replace_cache_entry = getL2CacheEntry(replace_addr);
              replace_tbe_entry = getTBE(replace_addr);
              trigger(Event:L2_Replacement, replace_addr, replace_cache_entry,
                      replace_tbe_entry);
            }
          }
        }
      } else {
        // *** DATA ACCESS ***
        ...
      }
    }
  }
}
In MESI_CMP_directory, getState() is called from an inport. This means we
cannot have an implementation of getState() that makes use of the cache
entry variable, since it would not have been set. Now, different
implementations for setState() and getState() simply do not make sense to
me, so in my opinion setState() will also not use the cache entry. These
two functions (I just went through the profile output for MOESI_hammer)
account for about half of the calls to isTagPresent(), the function we are
trying to get rid of.
I think we need to modify MESI_CMP_directory's getState() function to use
the cache_entry variable. It doesn't make sense to me that it wouldn't,
since the state is part of the cache entry. How else would getState() and
setState() access the cache state?
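To make this concrete, here is a rough sketch of what a cache-entry-aware
getState() could look like once the entry variables are passed in. This is
only an illustration of the idea, not the actual patch; the field names
TBEState and CacheState and the exact signature are assumptions on my part:

```
State getState(TBE_Entry tbe_entry, ENTRY cache_entry, Address addr) {
  // Prefer the transient state tracked in the TBE, if one exists
  if (is_valid(tbe_entry)) {
    return tbe_entry.TBEState;
  }
  // Otherwise read the stable state directly off the cache entry,
  // with no extra isTagPresent()/lookup call
  if (is_valid(cache_entry)) {
    return cache_entry.CacheState;
  }
  return State:I;
}
```

With this shape, getState() and setState() share one notion of where the
state lives, and neither needs to re-probe the cache arrays.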
Maybe I should point out that I'm assuming that getCacheEntry can
return a NULL pointer and thus that can be passed into the trigger
call when no cache or tbe entry exists.
You are correct that getCacheEntry() would have to return a NULL, another
thing which I believe we earlier preferred to avoid. As an aside, we cannot
use the keyword NULL since it is already in use, so we will have to rename
NULL to something else.
On second thought, I think it may not be necessary for getCacheEntry() to
use the keyword.
Yes, I initially hoped to avoid the NULL (or the OOD) keyword, but
over the past few weeks I have realized that isn't possible. By the
way, all things being equal, I would prefer to use your OOD keyword
over NULL. That way we differentiate it from the C++ NULL, since the
getCacheEntry() functions don't return pointers. At least, I'm still
hoping to avoid that.
Another concern is with the implementation of getCacheEntry(). If this
function has to return a pointer to a cache entry, we would have to
provide support for local variables that SLICC would internally assume to
be pointer variables.
Within SLICC, understanding that certain variables are actually
pointers is a bit of a nuisance, but there already exist examples
where we make that distinction. For instance, look at the
"if param.pointer" conditionals in StateMachine.py. We just have to
treat cache and tbe entries in the same fashion.
I know it is possible to declare pointers in SLICC, CacheMemory being the
foremost example. But we only allow the declaration of pointers. We tacitly
assume that whenever they are used, they will only be used after being
dereferenced. Now, in the case of getCacheEntry(), I do not see this
happening. Below is the current implementation of getCacheEntry() from
MOESI_CMP_directory-L1cache.sm.
Entry getCacheEntry(Address addr), return_by_ref="yes" {
  if (L1DcacheMemory.isTagPresent(addr)) {
    return static_cast(Entry, L1DcacheMemory[addr]);
  } else {
    return static_cast(Entry, L1IcacheMemory[addr]);
  }
}
I would probably convert it to something like this:

AbstractCacheEntry getCacheEntry(Address addr) {
  AbstractCacheEntry * cache_entry = L1DcacheMemory.lookup(addr);
  if (is_valid(cache_entry)) {
    return cache_entry;
  }
  return L1IcacheMemory.lookup(addr);
}
Now, if we assume that cache_entry will always be used in its dereferenced
form, then we will face a problem in returning the cache entry. We can
introduce a 'return_by_pointer' annotation for the function to handle this
situation. But you have pointed out in your previous emails that it is
possible for getCacheEntry() to do more things, which may cause problems
for the compiler.
I think we always assume cache_entry is dereferenced whenever one of its
members or member functions is called. However, cache_entry itself is a
pointer, and getCacheEntry() always returns a pointer.
In my opinion, we should maintain one method for looking up cache
entries.
My own experience tells me that it is not difficult to incorporate
calls to set/unset_cache_entry() in existing protocol implementations.

For implementing new protocols, I think the greater obstacle will be
implementing the protocol correctly, not using the entry variables
correctly. If we document this change clearly, there is no reason to
believe a SLICC programmer will be unduly burdened by it.

Assuming that this change does introduce some complexity in
programming with SLICC, does that complexity outweigh the performance
improvements?
My position is that we can leverage SLICC as an intermediate language and
achieve the performance benefits of your change without significantly
impacting programmability. I agree that we need the
set/unset_cache_entry calls in the allocate and deallocate actions. I
see no problem with that. I just want to treat these new implicit
cache and tbe entry variables like the existing implicit variable
'address'. Therefore, I want to pass them into the trigger operation like
the address variable. I also want just one method for looking up cache
entries. I believe the only difference is that I would like to set
the cache and tbe entries in the trigger function, as well as allow
them to be set in the actions.
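To sketch what I mean about keeping set_cache_entry in the actions: an
allocate action would still own the call that binds the implicit entry
variable, while the inport only reads entries. The action name, the "\i"
shorthand, and the allocate() signature below are illustrative, not taken
from the actual patch:

```
action(ii_allocateL1ICacheBlock, "\i", desc="Allocate an L1 I-cache block") {
  // Only allocate if the implicit cache_entry is not already bound
  if (is_valid(cache_entry) == FALSE) {
    // allocate() returns the new entry; set_cache_entry binds it to the
    // implicit cache_entry variable for the rest of this transition
    set_cache_entry(L1IcacheMemory.allocate(address, new Entry));
  }
}
```

The inport would never call set_cache_entry itself; it would only pass the
looked-up entries into trigger(), the same way the implicit address
variable is passed today.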
I still don't think that we can avoid those calls in inports. By avoiding
those calls, the jobs of the compiler and the programmer will only become
more difficult. Currently I am of the opinion that if we avoid those
calls, we should avoid this change altogether.
We definitely should not give up on this patch. You've put a lot of
hard work into it and, as your performance data suggests, it should
give us a noticeable speedup. I sympathize with your position and I
really appreciate the effort you've put in. If you'd like, I would be
happy to take over the patch.
Brad