Re: [m5-dev] Ruby random tester failing with MESI_CMP_directory?

2011-03-24 Thread Arkaprava Basu

Yeah, I will talk to Nilay. It's bizarre :(

Thanks
Arka

On 03/23/2011 05:02 PM, Lisa Hsu wrote:

Yes, that's bizarre, since I was using the same tip, no patches applied.
  Don't know what to tell you Arka...since you and Nilay are both in
Wisconsin maybe you can look at the differences in your setups, because so
far it seems like it's just you :P.

Lisa

On Wed, Mar 23, 2011 at 1:55 PM, Nilay Vaishni...@cs.wisc.edu  wrote:


It is working fine for me.

  ./build/X86_SE_MESI_CMP_directory/m5.fast
./configs/example/ruby_random_test.py -n 4 -l 1

M5 Simulator System

Copyright (c) 2001-2008
The Regents of The University of Michigan
All Rights Reserved


M5 compiled Mar 23 2011 15:53:38
M5 started Mar 23 2011 15:53:52
M5 executing on mumble-09.cs.wisc.edu
command line: ./build/X86_SE_MESI_CMP_directory/m5.fast
./configs/example/ruby_random_test.py -n 4 -l 1

Global frequency set at 10 ticks per second
info: Entering event queue @ 0.  Starting simulation...
hack: be nice to actually delete the event here
Exiting @ tick 14536941 because Ruby Tester completed




On Wed, 23 Mar 2011, Nilay Vaish wrote:

  I will try to bisect this.


--
Nilay

On Wed, 23 Mar 2011, Arkaprava Basu wrote:

  Hi Lisa and Nilay,

   Thanks for the response. Following is the tip of my repo

changeset:   8174:e21f6e70169e
tag: tip
user:Nilay Vaishni...@cs.wisc.edu
date:Tue Mar 22 06:41:54 2011 -0500
summary: Ruby: Remove CacheMsg class from SLICC

So this is after Nilay's patch for CacheMsg. And no, it did not run for
10-15 mins; it died immediately. The architecture should not matter for the
random tester. I am not sure why it's breaking. It seems like something
is broken.

Thanks
Arka





  ___

m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


[m5-dev] Ruby random tester failing with MESI_CMP_directory?

2011-03-22 Thread Arkaprava Basu

Hi,

 I just updated a clean gem5 repo, compiled MESI_CMP_directory, and 
tried to run the Ruby random tester, but it immediately failed as follows. 
Can anybody reproduce this?


Thanks
Arka


build/X86_SE_MESI_CMP_directory/m5.debug 
configs/example/ruby_random_test.py -l 10 -n 4

M5 Simulator System

Copyright (c) 2001-2008
The Regents of The University of Michigan
All Rights Reserved


M5 compiled Mar 22 2011 17:56:26
M5 started Mar 22 2011 17:58:16
M5 executing on rockstar.cs.wisc.edu
command line: build/X86_SE_MESI_CMP_directory/m5.debug 
configs/example/ruby_random_test.py -l 10 -n 4

Global frequency set at 10 ticks per second
info: Entering event queue @ 0.  Starting simulation...
fatal: Invalid transition
system.dir_cntrl0 time: 1125 addr: [0x400, line 0x400] event: Fetch state: M
 @ cycle 1125
[doTransitionWorker:build/X86_SE_MESI_CMP_directory/mem/protocol/Directory_Transitions.cc, 
line 234]

Memory Usage: 297516 KBytes
For more information see: http://www.m5sim.org/fatal/23f196b2

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-25 Thread Arkaprava Basu
In sum, I think we all agree that Ruby is going to handle *only 
non-speculative stores*.  The M5 CPU model(s) handle all speculative and 
non-speculative stores that are *yet to be revealed to the memory 
sub-system*.


To make it clearer, as I understand it, we now have the following:

1. All store buffering (speculative and non-speculative) is handled by 
the CPU model in M5.
2. Ruby needs to forward interventions/invalidations received at the L1 
cache controller to the CPU model, to let it take appropriate action to 
provide the required memory consistency guarantees (e.g. it may need to 
flush the pipeline).

OR
CPU models need to check coherence permission at the L1 cache at 
commit time, to know whether intervening writes have happened (this might 
be required to implement a stricter model like SC).


I think we need to provide one of the two facilities above from the Ruby 
side. Which one to provide depends upon what the M5 CPU models want to do 
to guarantee consistency.
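
As a rough sketch of the first alternative (all names here are illustrative, not the actual gem5/Ruby API), the L1 controller would forward an external invalidation up to the CPU model, which squashes any speculative loads to that address:

```python
# Hypothetical sketch: the L1 controller forwards invalidations to the
# CPU model, which may squash/replay speculative loads to preserve the
# consistency model. CpuModel/L1Controller are illustrative names only.

class CpuModel:
    def __init__(self):
        self.speculative_loads = {}   # addr -> list of in-flight load ids

    def on_invalidation(self, addr):
        """Called by the L1 controller when an external invalidation hits."""
        squashed = self.speculative_loads.pop(addr, [])
        # A real O3 model would squash and re-execute these instructions.
        return squashed

class L1Controller:
    def __init__(self, cpu):
        self.cpu = cpu
        self.lines = set()

    def handle_external_inv(self, addr):
        self.lines.discard(addr)
        # Forward the probe to the CPU so it can enforce consistency.
        return self.cpu.on_invalidation(addr)

cpu = CpuModel()
l1 = L1Controller(cpu)
l1.lines.add(0x400)
cpu.speculative_loads[0x400] = [7, 9]
squashed = l1.handle_external_inv(0x400)
```

The second alternative would instead leave the L1 untouched on the probe and have the CPU query coherence permission at commit time.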


Please let me know if you disagree or if I am missing something.

Thanks
Arka




On 02/24/2011 05:22 PM, Beckmann, Brad wrote:

So I think Steve and I are in agreement here.  We both agree that both 
speculative and non-speculative store buffers should be on the CPU side of the 
RubyPort interface.  I believe that was the same line that existed when Ruby 
tied to Opal in GEMS.  I believe the non-speculative store buffer was only a 
feature used when Opal was not attached, and it was just the simple 
SimicsProcessor driving Ruby.

The sequencer is a separate issue.  Certain functionality of the sequencer can 
probably be eliminated in gem5, but I think other functionality needs to remain 
or at least be moved to some other part of Ruby.  The sequencer performs a lot 
of protocol independent functionality including: updating the actual data 
block, performing synchronization with respect to the cache memory, translating 
m5 packets to ruby requests, checking for per-cacheblock deadlock, and 
coalescing requests to the same cache block.  The coalescing functionality can 
probably be eliminated, but I think the other functionality needs to remain.

Brad


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Steve Reinhardt
Sent: Thursday, February 24, 2011 1:52 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer


On Thu, Feb 24, 2011 at 1:32 PM, Nilay 
Vaishni...@cs.wisc.edumailto:ni...@cs.wisc.edu  wrote:
On Thu, 24 Feb 2011, Beckmann, Brad wrote:
Steve, I think we are in agreement here and we may just be disagreeing with the 
definition of speculative.  From the Ruby perspective, I don't think it really 
matters...I don't think there is difference between a speculative store address 
request and a prefetch-with-write-intent. Also we agree that probes will need 
to be sent to O3 LSQ to support the consistency model.
My point is that if we believe this functionality is required, what is the 
extra overhead of adding a non-speculative store buffer to the O3 model as 
well?  I think that will be easier than trying to incorporate the current Ruby 
non-speculative store buffer into each protocol.

I don't know the O3 LSQ model very well, but I assume it buffers both 
speculative and non-speculative stores.  Are there two different structures in 
Ruby for that?

I think the general issue here is that the dividing line between processor and memory 
system is different in M5 than it was with GEMS, with M5 assuming that write buffers, redundant request 
filtering, etc. all happen in the processor.  For example, I know I've had you explain this to 
me multiple times already, but I still don't understand why we still need Ruby sequencers either :-).

Brad, I raise the same point that Arka raised earlier. Other processor models 
can also make use of store buffer. So, why only O3 should have a store buffer?

Nilay, I think that's a different issue... we're not saying that other CPU 
models can't have store buffers, but in practice, the simple CPU models block 
on memory accesses so they don't need one.  If the inorder model wants to add a 
store buffer (if it doesn't already have one), it would be an internal decision 
for them whether they want to write one from scratch or try to reuse the O3 
code.  There are already some shared structures in src/cpu like branch 
predictors that can be reused across CPU models.

So in other words we need to decide first where the store buffer should live 
(CPU or memory system) and then we can worry about how to reuse that code if 
that's useful.
Steve



___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-23 Thread Arkaprava Basu

Hi Brad,

  I have very little knowledge about the store buffer 
implementation in libruby and the o3 CPU model. But I have the following 
high-level question:


Is this store buffer in libruby only for keeping retired 
(non-speculative) stores? If yes, then why does a particular CPU model 
matter here? In-order cores (CPU models) can also use a (retired) store 
buffer. I believe it is an issue of the memory consistency model being 
simulated rather than something tied to a particular CPU model 
(in-order/out-of-order).  Please correct me if I am totally missing 
something here.


Thanks
Arka

On 02/23/2011 07:07 PM, Beckmann, Brad wrote:

That's a good question.  Before we get rid of it, we should decide what the 
interface between Ruby and the o3 LSQ should be.  I don't know how the current o3 LSQ 
works, but I imagine that we need to pass probe requests through the RubyPort to 
make it work correctly.

Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad



-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
On Behalf Of Nilay Vaish
Sent: Wednesday, February 23, 2011 4:51 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Store Buffer

Brad,

In case we remove libruby, what becomes of the store buffer? In fact, is
store buffer in use?

Thanks
Nilay
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-23 Thread Arkaprava Basu
Fundamentally, I wish to handle only non-speculative memory state within 
Ruby. Otherwise, I think there is a risk of Ruby being affected by 
the CPU model's behavior/nuances. As you suggested, the RubyPort may well be 
the line dividing speculative and non-speculative state.


I haven't looked at the store buffer code in libruby and do not know 
how it interfaces with the protocols. So, sorry, I don't have specific 
answers to your questions. I think Derek is the best person to comment 
on this, as I believe he has used the store buffer implementation for his 
prior research.


I do think, though, that the highest-level (closest to the processor) 
cache controller (i.e. *-L1Cache.sm) needs to be made aware of the store 
buffer (unless it is hacked to bypass SLICC).


 Thanks
Arka

On 02/23/2011 11:29 PM, Beckmann, Brad wrote:

Sorry, I should have been more clear.  It fundamentally comes down to how the 
Ruby interface helps support memory consistency, especially considering more 
realistic buffering between the CPU and memory system (both speculative and 
non-speculative).  I'm pretty certain that Ruby and the RubyPort interface will 
need to be changed.  I just want us to fully understand the issues before making 
any changes or removing certain options.  So are you advocating that the 
RubyPort interface be the line between speculative memory state and 
non-speculative memory state?

As far as the current Ruby store buffer goes, how does it work with the L1 
cache controller?  For instance, if the L1 cache receives a probe/forwarded 
request to a block that exists in the non-speculative store buffer, what is the 
mechanism to retrieve the up-to-date data from the buffer entry?  Is the 
mechanism protocol agnostic?

Brad



-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
Behalf Of Arkaprava Basu
Sent: Wednesday, February 23, 2011 6:10 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer

Hi Brad,

I have very little knowledge about the store buffer
implementation in libruby and the o3 CPU model. But I have the following
high-level question:

Is this store buffer in libruby only for keeping retired
(non-speculative) stores? If yes, then why does a particular CPU model
matter here? In-order cores (CPU models) can also use a (retired) store
buffer. I believe it is an issue of the memory consistency model being
simulated rather than something tied to a particular CPU model
(in-order/out-of-order).  Please correct me if I am totally missing
something here.

Thanks
Arka

On 02/23/2011 07:07 PM, Beckmann, Brad wrote:

That's a good question.  Before we get rid of it, we should decide

what the interface between Ruby and the o3 LSQ should be.  I don't know
how the current o3 LSQ works, but I imagine that we need to pass probe
requests through the RubyPort to make it work correctly.

Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad



-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
On Behalf Of Nilay Vaish
Sent: Wednesday, February 23, 2011 4:51 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Store Buffer

Brad,

In case we remove libruby, what becomes of the store buffer? In

fact, is

store buffer in use?

Thanks
Nilay
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol

2011-02-07 Thread Arkaprava Basu
http://repo.m5sim.org/m5?cmd=changeset;node=8f37a23e02d7
description:
Ruby: Fixes MESI CMP directory protocol
The current implementation of the MESI CMP directory protocol is broken.

This patch, from Arkaprava Basu, fixes the protocol.

diffstat:



___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: ruby: support to stallAndWait the mandatory queue

2011-01-22 Thread Arkaprava Basu

Hi Nilay,

You are mostly correct. I believe this patch contains two things:

1. Support in SLICC to allow waiting and stalling on messages in a message 
buffer when the directory is in a blocking state for that address (i.e. it 
cannot process the message at this point), until some event occurs 
that makes consumption of the message possible. When the directory 
unblocks, it provides the support for waking up the messages that were 
hitherto waiting (this is the precise reason why you did not see a pop of 
the mandatory queue, but do see WakeUpAllDependants).


2. It contains changes to MOESI_hammer protocol that leverages this support.

For the purpose of this particular discussion, the 1st part is the 
relevant one.


As far as I understand, the support in SLICC for waiting and stalling 
was introduced primarily to enhance fairness in the way SLICC handles 
coherence requests. Without this support, when a message arrives at a 
controller in a blocking state, it recycles, which means it polls again 
(and thus looks up the cache again) in 10 cycles (generally the recycle 
latency is set to 10). If multiple messages arrive while the controller is 
in a blocking state for a given address, you can easily see that there is NO 
fairness. A message that arrived latest for the blocking address can 
be served first when the controller unblocks. With the new support for 
stalling and waiting, the blocked messages are put in a FIFO queue, 
thus providing better fairness.
But as you have correctly guessed, another major advantage of this 
support is that it reduces the unnecessary lookups to the cache structure 
that happen due to polling (a.k.a. recycle).  So in summary, I believe 
that the problem you are seeing with too many lookups will *reduce* when 
the protocols are adjusted to take advantage of this facility. On a 
related note, I should also mention that another fringe benefit of this 
support is that it helps in debugging coherence protocols. With this, 
coherence protocol traces won't contain thousands of debug messages for 
recycling, which can be pretty annoying for protocol writers.
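
The mechanics described above can be modeled in a few lines. This is an illustrative Python sketch, not the actual SLICC/MessageBuffer code: recycle() re-polls blocked messages, while a stallAndWait()/wakeUpAllDependants() pair parks them in a per-address FIFO so they wake in arrival order:

```python
# Toy model of a message buffer with stall-and-wait support.
# stall_and_wait parks the head message in a per-address FIFO;
# wake_up_all_dependants re-inserts them preserving arrival order.
from collections import deque, defaultdict

class MessageBuffer:
    def __init__(self):
        self.queue = deque()
        self.stalled = defaultdict(deque)  # addr -> FIFO of stalled msgs

    def enqueue(self, msg):
        self.queue.append(msg)

    def stall_and_wait(self, addr):
        # Move the head message to the per-address stall FIFO.
        self.stalled[addr].append(self.queue.popleft())

    def wake_up_all_dependants(self, addr):
        # Re-insert stalled messages at the front, oldest first.
        for msg in reversed(self.stalled.pop(addr, deque())):
            self.queue.appendleft(msg)

buf = MessageBuffer()
for i in range(3):
    buf.enqueue(("GETX", 0x400, i))   # three requests to the same block
for _ in range(3):
    buf.stall_and_wait(0x400)         # controller is blocked on 0x400
buf.wake_up_all_dependants(0x400)     # controller unblocks
order = [msg[2] for msg in buf.queue] # arrival order is preserved
```

With recycling there is no such queue, so whichever message happens to be polled first after unblocking wins, regardless of arrival order.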


I hope this helps,

Thanks
Arka



On 01/22/2011 06:40 AM, Nilay Vaish wrote:

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/408/#review797
---


I was thinking about why the ratio of the number of memory lookups, as reported by 
gprof, to the number of memory references, as reported in stats.txt, is so high.

While I was working with the MESI CMP directory protocol, I had seen that the same
request from the processor is looked up again and again in the cache if the request
is waiting for some event to happen. For example, suppose a processor asks to load
address A, but the cache has no space for holding address A. Then, it will give up
some cache block B before it can bring in address A.

The problem is that while cache block B is being given up, it is possible that the
request made for address A is looked up in the cache again, even though we know it
is not possible that we would find it in the cache. This is because the requests in
the mandatory queue are recycled till they get done with.

Clearly, we should move the request for bringing in address A to a separate structure,
instead of looking it up again and again. The new structure should be looked up whenever
an event that could possibly affect the status of this request occurs. If we do this,
then I think we should see a further reduction in the number of lookups. I would expect
almost 90% of the lookups to the cache to go away. This should also mean a 5% improvement
in simulator performance.
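
A back-of-the-envelope comparison (illustrative only, with made-up numbers) shows where the savings come from: recycling pays one lookup every recycle period for the whole wait, while a parked request pays a single lookup when its wakeup event fires:

```python
# Toy lookup-count comparison for a request that must wait N cycles
# for a victim block B to be written back before address A can be fetched.

def lookups_with_recycle(wait_cycles, recycle_latency=10):
    # One cache lookup each time the recycled request is re-polled.
    return wait_cycles // recycle_latency

def lookups_with_pending_table(wait_cycles):
    # The request sits in a side structure; only the completion event
    # (e.g. "writeback of B done") triggers one final lookup.
    return 1

wait = 500  # cycles until block B's eviction completes (made-up number)
recycled = lookups_with_recycle(wait)      # 50 lookups
parked = lookups_with_pending_table(wait)  # 1 lookup
```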

Brad, do you agree with the above reasoning? If I am reading the patch correctly, I 
think this patch is trying to do that, though I do not see the mandatory queue being 
popped. Can you explain the purpose of the patch in a slightly more verbose manner? 
If it is doing what I said above, then I think we should do this for all the protocols.

- Nilay


On 2011-01-06 16:19:46, Brad Beckmann wrote:

---
This is an automatically generated e-mail. To reply, visit:
http://reviews.m5sim.org/r/408/
---

(Updated 2011-01-06 16:19:46)


Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan 
Binkert.


Summary
---

ruby: support to stallAndWait the mandatory queue

By stalling and waiting the mandatory queue instead of recycling it, one can
ensure that no incoming messages are starved when the mandatory queue puts
significant pressure on the L1 cache controller (i.e. the ruby memtester).


Diffs
-

   src/mem/protocol/MOESI_CMP_token-L1cache.sm 9f9e10967912
   src/mem/protocol/MOESI_hammer-cache.sm 9f9e10967912
   src/mem/ruby/buffers/MessageBuffer.hh 9f9e10967912
   

Re: [m5-dev] (no subject)

2011-01-18 Thread Arkaprava Basu
I think there are different topology files for different layouts, thus 
allowing different numbers of controllers. For example, the topology named 
MeshDirCorners would allow a configuration with --num-cpus 16 
--num-l2caches 16 --num-dirs 4. This essentially places the MCs (a.k.a. 
dirs) at the corners of the chip.


Similarly, if a disproportionate number of L2 controllers is needed, then 
the MeshClustered or MeshClusteredDirCorners topologies need to be used.



Thanks
Arka

On 01/18/2011 11:28 PM, Nilay wrote:

Brad,

I got the simulation working. It seems to me that you wrote Mesh.py under
the assumption that number of cpus = number of L1 controllers = number of
L2 controllers (if present) = number of directory controllers.

The following options worked after some struggle and some help from Arka -

./build/ALPHA_FS_MESI_CMP_directory/m5.fast ./configs/example/ruby_fs.py
--maxtick 20 -n 16 --topology Mesh --mesh-rows 4 --num-dirs 16
--num-l2caches 16

--
Nilay


On Tue, January 18, 2011 10:28 am, Beckmann, Brad wrote:

Hi Nilay,

My plan is to tackle the functional access support as soon as I check in
our current group of outstanding patches.  I'm hoping to at least check in
the majority of them in the next couple of days.  Now that you've
completed the CacheMemory access changes, you may want to re-profile GEM5
and make sure the next performance bottleneck is routing network messages
in the Perfect Switch.  In particular, you'll want to look at rather large
(16+ core) systems using a standard Mesh network.  If you have any
questions on how to do that, Arka may be able to help you out, if not, I
can certainly help you.  Assuming the Perfect Switch shows up as a major
bottleneck (> 10%), then I would suggest that as the next area you can
work on.  When looking at possible solutions, don't limit yourself to just
changes within the Perfect Switch itself.  I suspect that redesigning how
destinations are encoded and/or the interface between MessageBuffer
dequeues and the PerfectSwitch wakeup will lead to a better solution.

Brad



-Original Message-
From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
Sent: Tuesday, January 18, 2011 6:59 AM
To: Beckmann, Brad
Cc: m5-dev@m5sim.org
Subject:

Hi Brad

Now that those changes to CacheMemory, SLICC and protocol files have
been pushed in, what's next that you think we should work on? I was
going
through some of the earlier emails. You have mentioned functional access
support in Ruby, design of the Perfect Switch, consolidation of stat
files.

Thanks
Nilay



___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-04 Thread Arkaprava Basu

Hi Nilay,

   On the deadlock issue with MESI_CMP_directory:
   Yes, this can happen, as the ruby_tester or Sequencer only reports 
*possible* deadlocks. With a higher number of processors there is more 
contention (and thus latency), and it can mistakenly report a deadlock. I 
generally look at the protocol trace to figure out whether there is 
actually any deadlock or not. You can also try doubling the Sequencer 
deadlock threshold and see if the problem goes away. If it's a true 
deadlock, it will break again.


On a related note, as Brad has pointed out, MESI_CMP_directory has 
its share of issues. Recently one of Prof. Sarita Adve's students 
e-mailed us (Multifacet) about 6 bugs he found while model checking 
MESI_CMP_directory (including a major one). I took some time to look at 
them, and it seems like MESI_CMP_directory is now fixed (hopefully).  The 
modified protocol is now passing 1M checks with 16 processors with 
multiple random seeds.  I can locally coordinate with you on this, if 
you want.


Thanks
Arka

On 01/04/2011 11:43 AM, Nilay Vaish wrote:



On 2011-01-03 15:31:20, Brad Beckmann wrote:

Hi Nilay,

First, I must say this is an impressive amount of work.  You definitely got a 
lot done over holiday break. :)

Overall, this set of patches is definitely close, but I want to see if we can 
take them a step forward.  Also I have a few suggestions that may make things 
easier.  Finally, I have a bunch of minor questions/suggestions on individual 
lines, but I’ll hold off on those until you can respond to my higher-level 
questions.

The main thing I would like to see improved is not having to differentiate 
between “entry” and “entry_ptr” in the .sm files.  Am I correct that the only 
functions in the .sm files that are passed an “entry_ptr” are “is_valid_ptr”, 
“getCacheEntry”, and “set_cache_entry”?  If so, it seems that all three 
functions are generated with unique python code, either in an AST file or 
StateMachine.py.  Therefore, could we just pass these functions “entry” and 
rely on the underneath python code to generate the correct references?  This 
would make things more readable, “is_valid_ptr()” becomes “is_valid”, and it 
doesn’t require the slicc programmer to understand which functions take an 
entry pointer versus the entry itself.  If we can’t make such a change, I worry 
about how much extra complexity this change pushes on the slicc programmer.

Also another suggestion to make things more readable, please replace the name 
L1IcacheMemory_entry with L1I_entry.  Do the same for L1D_entry and L2_entry.  
That will shorten many of your lines.

So am I correct that hammer’s simultaneous usage of valid L1 and L2 cache 
entries in certain transitions is the only reason that within all actions, the 
getCacheEntry calls take multiple cache entries?  If so, I think it would be 
fairly trivial to use a tbe entry as an intermediary between the L1 and L2 for 
those particular hammer transitions.  That way only one cache entry is valid at 
any particular time, and we can simply use the variable cache_entry in the 
actions.  That should clean things up a lot.

By the way, once you check in these patches, the MESI_CMP_directory protocol 
will be deprecated, correct?  If so, make sure you include a patch that removes 
it from the regression tester.

Brad


The main thing I would like to see improved is not having to differentiate
between “entry” and “entry_ptr” in the .sm files.  Am I correct
that the only functions in the .sm files that are passed an
“entry_ptr” are “is_valid_ptr”, “getCacheEntry”, and
“set_cache_entry”?  If so, it seems that all three functions are
generated with unique python code, either in an AST file or
StateMachine.py.  Therefore, could we just pass these functions
“entry” and rely on the underneath python code to generate the correct
references?  This would make things more readable, “is_valid_ptr()”
becomes “is_valid”, and it doesn’t require the slicc programmer to
understand which functions take an entry pointer versus the entry itself.
If we can’t make such a change, I worry about how much extra complexity
this change pushes on the slicc programmer.

There are functions that are passed cache entry and transaction buffer entry as 
arguments. Currently, I assume that these arguments are passed using pointers.


Also another suggestion to make things more readable, please replace the
name L1IcacheMemory_entry with L1I_entry.  Do the same for L1D_entry and
L2_entry.  That will shorten many of your lines.

The names of the cache entry variables are currently tied with the names of the 
cache memory variables belonging to the machine. If the name of the cache 
memory variable is A, then the corresponding cache entry variable is named 
A_entry.


So am I correct that hammer’s simultaneous usage of valid L1 and L2
cache entries in certain transitions is the only reason that within all
actions, the getCacheEntry calls take 

Re: [m5-dev] Fixing MESI CMP directory protocol

2011-01-04 Thread Arkaprava Basu

These are the steps I use:

1. First run with whatever the default threshold values are.
2. If it deadlocks, take a trace and try to find out whether there is an 
evident reason for deadlock or not.

3. If not, double the default threshold value and run again.
4. If the same test passes with the larger threshold, then the deadlock 
was actually not there, so life is good. If not, I need to dig more into 
the trace to see what's going on.
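
The steps above can be sketched as a simple check (hypothetical, heavily simplified, not the real Sequencer code): a possible-deadlock warning fires whenever a request has been outstanding longer than the threshold, so a merely slow request stops tripping the check once the threshold is doubled, while a truly stuck one trips any finite threshold:

```python
# Toy Sequencer-style possible-deadlock check. Requests outstanding
# longer than the threshold are flagged as *possible* deadlocks.

def possible_deadlocks(outstanding, now, threshold):
    """outstanding: dict mapping addr -> cycle the request was issued."""
    return [a for a, issued in outstanding.items() if now - issued > threshold]

outstanding = {0x100: 50, 0x200: 900}  # 0x100 was issued long ago
hits_default = possible_deadlocks(outstanding, now=1000, threshold=500)
hits_doubled = possible_deadlocks(outstanding, now=1000, threshold=1000)
# 0x100 is flagged at the default threshold but not at the doubled one,
# suggesting it is slow (high contention) rather than truly deadlocked.
```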


@Nilay:
By the end of today, I will share with you the patch that seems to fix 
that protocol.


Thanks
Arka

On 01/04/2011 12:51 PM, Nilay Vaish wrote:

What threshold do you use?

On Tue, 4 Jan 2011, Arkaprava Basu wrote:


Hi Nilay,

  On the deadlock issue with MESI_CMP_directory:
  Yes, this can happen, as the ruby_tester or Sequencer only reports 
*possible* deadlocks. With a higher number of processors there is more 
contention (and thus latency), and it can mistakenly report a deadlock. 
I generally look at the protocol trace to figure out whether there is 
actually any deadlock or not. You can also try doubling the Sequencer 
deadlock threshold and see if the problem goes away. If it's a true 
deadlock, it will break again.


On a related note, as Brad has pointed out, MESI_CMP_directory has 
its share of issues. Recently one of Prof. Sarita Adve's students 
e-mailed us (Multifacet) about 6 bugs he found while model checking 
MESI_CMP_directory (including a major one). I took some time to 
look at them, and it seems like MESI_CMP_directory is now fixed 
(hopefully).  The modified protocol is now passing 1M checks with 16 
processors with multiple random seeds.  I can locally coordinate with 
you on this, if you want.


Thanks
Arka


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Connecting to cache ports

2010-05-16 Thread Arkaprava Basu
Actually we looked into those files, especially AtomicSimpleCPU and 
Memtester, but we are getting confused about how the cache ports get 
connected properly. In our wrapper, we also had an icacheport and a 
dcacheport, but we are not sure from where we can register them (i.e. get 
Port::setPeer() called with the proper parameters). We were guessing this 
might be done through some python/swig stuff, but frankly we are 
getting lost somewhere. Any clue on this would really help us.


Thank you,
Arka & Rathijit


Steve Reinhardt wrote:

You need to use Port objects for this connection, just like the real
CPUs do (and the memtester).  There isn't a lot of documentation on
the wiki, but I think the details are discussed in the tutorial.
Using the existing CPU or memtester code as an example is probably the
best route.  Let us know if you have any specific questions.

Steve
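
To make the pairing concrete, here is a minimal Python model of the idea (this is not the real M5 Port class; in M5 the pairing is driven from the Python config when one port is assigned to another, and the names below are illustrative):

```python
# Toy model of symmetric port peering: connecting two ports makes each
# the peer of the other, which is what lets a CPU-side port and a
# cache-side port exchange packets.

class Port:
    def __init__(self, name):
        self.name = name
        self.peer = None

    def set_peer(self, other):
        # Peering is symmetric: each port records the other as its peer.
        self.peer = other
        other.peer = self

icache_port = Port("cpu.icache_port")
cache_cpu_side = Port("l1i.cpu_side")
icache_port.set_peer(cache_cpu_side)
```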

On Fri, May 14, 2010 at 10:16 PM, Arkaprava Basu aba...@wisc.edu wrote:
  

Hi,

   We are trying to connect a dummy cpu model to caches. So we require to 
connect the icache and dcache ports of this dummy cpu model to that of M5 
caches. Can anybody please tell us what is the best way to achieve this 
connection ?

Arka & Rathijit
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev



___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
  

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


[m5-dev] Connecting to cache ports

2010-05-14 Thread Arkaprava Basu
Hi,

We are trying to connect a dummy CPU model to caches, so we need to 
connect the icache and dcache ports of this dummy CPU model to those of the 
M5 caches. Can anybody please tell us the best way to achieve this 
connection?

Arka & Rathijit
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev