Re: [gem5-dev] Ruby: Token Coherence and Functional Access

2011-06-13 Thread Beckmann, Brad
Yes, the token protocol is definitely one of those protocols that prevents us 
from tightly coupling the functional access support to the protocols.  However, 
I don't think this issue will result in silently corrupted behavior.  Instead, it 
seems the result would be an error generated in the simulation, correct?  
Specifically in the example you mention, all controllers are in the stable 
Invalid state, right?  Therefore, the functional access won't find a valid 
block anywhere, and an error will be generated.  That seems like the right 
behavior to me.
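The behavior Brad describes can be sketched with a few lines. This is a hypothetical illustration, not gem5's actual API: a functional read succeeds only if some controller holds the block with a valid permission, and if every controller is in a stable Invalid state the access fails loudly rather than silently returning stale data.

```python
# Hypothetical sketch (not gem5's actual API): a functional read succeeds
# only if some controller can supply a valid copy of the block.
INVALID, READ_ONLY, READ_WRITE = "Invalid", "Read_Only", "Read_Write"

def functional_read_succeeds(controller_permissions):
    """Return True if any controller holds the block with a valid permission."""
    return any(p != INVALID for p in controller_permissions)

# Token-protocol corner case: data is in flight, all controllers report
# Invalid, so the access is rejected with an error rather than corrupted.
assert not functional_read_succeeds([INVALID, INVALID])
assert functional_read_succeeds([INVALID, READ_WRITE])
```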

Brad


 -Original Message-
 From: gem5-dev-boun...@m5sim.org [mailto:gem5-dev-
 boun...@m5sim.org] On Behalf Of Nilay Vaish
 Sent: Friday, June 10, 2011 8:50 AM
 To: gem5-dev@m5sim.org
 Subject: [gem5-dev] Ruby: Token Coherence and Functional Access
 
 Brad, in the token coherence protocol, the l2 cache controller moves from
 state O to I and sends data to the memory. I think this particular transition
 may pose a problem in enabling functional accesses for the protocol. The
 problem, I think, is that both the directory and the cache controller are in
 stable states even though there is data travelling in the network. This means
 that both the controllers will allow a functional write to go ahead. But then
 the data will be over written by the value sent from the l2 controller to the
 directory controller.
 
 My understanding of the protocol implementation is close to \epsilon. I think
 this is what I observed this morning. Do you think this understanding is
 correct?
 
 --
 Nilay
 ___
 gem5-dev mailing list
 gem5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/gem5-dev


___
gem5-dev mailing list
gem5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/gem5-dev


Re: [m5-dev] Move of Garnet/Orion config file

2011-05-10 Thread Beckmann, Brad
Hi Korey,

We are in the process of moving all the Orion code out of Ruby and into McPAT.  
When that is complete, I suspect the router.cfg file will be removed.  Tushar, 
please correct me if I'm wrong.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Tuesday, May 10, 2011 4:19 PM
 To: M5 Developer List
 Subject: [m5-dev] Move of Garnet/Orion config file
 
 There is one detail in Garnet/Orion that probably needs to be moved to
 python.
 
 It's the router.cfg file for the Orion stats which is hard coded into the
 C++ here:
 m5-ix-link-ruby/src/mem/ruby/network/orion/NetworkPower.cc:79:
 const string cfg_fn = "src/mem/ruby/network/orion/router.cfg";
 
 The contents of that file are just a bunch of values to be read by Orion when
 Garnet finishes.
 
 As it is now, this forces you to copy this long directory path whenever you 
 are
 exporting the build directory to run M5 on a cluster of machines.
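The fix being suggested can be sketched as follows. Names here are hypothetical: the idea is to take the router.cfg path as a configuration parameter, set from the Python config, instead of hard-coding a source-tree-relative path in the C++.

```python
# Sketch of the suggested fix (names hypothetical): prefer an explicitly
# configured path over the historical hard-coded default.
DEFAULT_ROUTER_CFG = "src/mem/ruby/network/orion/router.cfg"

def router_cfg_path(param=None):
    """Return the configured router.cfg path, falling back to the old default."""
    return param if param else DEFAULT_ROUTER_CFG

assert router_cfg_path("/scratch/run/router.cfg") == "/scratch/run/router.cfg"
assert router_cfg_path() == DEFAULT_ROUTER_CFG
```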
 
 Tushar, is this something that you'd be willing to look into? Someone else?
 
 --
 - Korey
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Correctly set access permissions for directory entries

2011-05-08 Thread Beckmann, Brad
The stats and regression tester should not need to be updated with this patch.  
This is purely a Ruby/SLICC mechanism change.

Brad

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
 Sent: Saturday, May 07, 2011 6:17 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Review Request: Ruby: Correctly set access
 permissions for directory entries
 
 Korey, I don't think there will be any change in the simulation
 performance. I am not sure about stats.
 
 Brad, were the stats updated after you made the change?
 
 --
 Nilay
 
 
 
 On Fri, 6 May 2011, Korey Sewell wrote:
 
  Nilay,
  can you explain the impact of that bug in terms of simulation
 performance?
  Are benchmarks running slower because of this change? Will
 regressions need
  to be updated?
 
  On Fri, May 6, 2011 at 8:13 PM, Beckmann, Brad
 brad.beckm...@amd.comwrote:
 
  Hi Nilay,
 
  Yeah, pulling the State into the Machine makes sense to me.  If I
 recall,
  my previous patch made it necessary that each machine included a
  state_declaration (previously the state enum).  More tightly
 integrating the
  state to the machine seems to be a natural progression on that path.
 
  I understand moving the permission settings back to setState is the
 easiest
  way to make this work.  However, it would be great if we could
 combine the
  setting of state and the setting of permission into one function
 call from
  the sm file.  Thus we don't have to worry about the situation where
 one sets
  the state, but forgets to set the permission.  That could lead to
 some
  random functional access failing and a very painful debug.
 
  Brad
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Correctly set access permissions for directory entries

2011-05-06 Thread Beckmann, Brad
Hi Nilay,

Yeah, pulling the State into the Machine makes sense to me.  If I recall, my 
previous patch made it necessary that each machine included a state_declaration 
(previously the state enum).  More tightly integrating the state to the machine 
seems to be a natural progression on that path.

I understand moving the permission settings back to setState is the easiest way 
to make this work.  However, it would be great if we could combine the setting 
of state and the setting of permission into one function call from the sm file. 
 Thus we don't have to worry about the situation where one sets the state, but 
forgets to set the permission.  That could lead to some random functional 
access failing and a very painful debug.
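The combined call Brad asks for can be sketched in a few lines. The names are illustrative, not SLICC's generated code: the access permission is derived from the coherence state inside a single set_state() call, so a protocol writer cannot update one and forget the other.

```python
# Hedged sketch (names illustrative): one call updates both the state and
# the permission; the permission is never set directly.
STATE_TO_PERMISSION = {"M": "Read_Write", "S": "Read_Only", "I": "Invalid"}

class CacheEntry:
    def __init__(self):
        self.state = "I"
        self.permission = "Invalid"

    def set_state(self, state):
        # Deriving the permission here makes a state/permission mismatch
        # impossible, which is the failure mode described above.
        self.state = state
        self.permission = STATE_TO_PERMISSION[state]

e = CacheEntry()
e.set_state("M")
assert e.permission == "Read_Write"
e.set_state("I")
assert e.permission == "Invalid"
```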

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Friday, May 06, 2011 3:52 PM
 To: Nilay Vaish; Default
 Subject: [m5-dev] Review Request: Ruby: Correctly set access permissions
 for directory entries
 
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/684/
 ---
 
 Review request for Default.
 
 
 Summary
 ---
 
 Ruby: Correctly set access permissions for directory entries
 
 The access
 permissions for the directory entries are not being set correctly.
 This is because pointers are not used for handling directory entries.
 The setState() function will once again set the permissions as well.
 But it would make use of the State_to_permission() function, instead of
 doing the analysis it used to do earlier. The changePermission() function
 provided by the AbstractEntry and AbstractCacheEntry classes has been
 exposed to SLICC code once again. The set_permission() functionality has
 been removed.
 
 I have done this only for the MESI protocol so far. Once we build a consensus
 on the changes, I will make changes to other protocols as well.
 
 As far as testing is concerned, the protocol compiles and clears 1 loads.
 I did not test any more than that.
 
 A point that I wanted to raise for discussion: I think we should pull the
 State enum and the accompanying functions into the Machine itself. Brad, what
 do you think?
 
 
 Diffs
 -
 
   src/mem/protocol/MESI_CMP_directory-L1cache.sm 3c628a51f6e1
   src/mem/protocol/MESI_CMP_directory-L2cache.sm 3c628a51f6e1
   src/mem/protocol/MESI_CMP_directory-dir.sm 3c628a51f6e1
   src/mem/protocol/RubySlicc_Types.sm 3c628a51f6e1
   src/mem/slicc/ast/MethodCallExprAST.py 3c628a51f6e1
   src/mem/slicc/symbols/StateMachine.py 3c628a51f6e1
 
 Diff: http://reviews.m5sim.org/r/684/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick

2011-04-29 Thread Beckmann, Brad
I can't reproduce these scons errors and they don't seem to happen from a clean 
build.  Can we blow away the current build directory on zizzer and re-run the 
regression tester?  I would do it myself, but I don't have access to zizzer.

Thanks,

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Cron Daemon
 Sent: Friday, April 29, 2011 12:17 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression
 quick

 scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_RubyNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_Topology_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_RubySystem_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_GarnetNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_SimpleNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_GarnetNetwork_d_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_RubyNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_Topology_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_RubySystem_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_GarnetNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_SimpleNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_GarnetNetwork_d_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_RubyNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_Topology_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_RubySystem_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_GarnetNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_SimpleNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_GarnetNetwork_d_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_CMP_directory/python/m5/internal/param_RubyNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_CMP_directory/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
 scons: *** Implicit dependency `build/ALPHA_SE_MOESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_CMP_directory/python/m5/internal/param_Topol
 

Re: [m5-dev] Code Reviewing

2011-04-27 Thread Beckmann, Brad
I suspect that my recent review posts motivated this thread.

Overall, I think the policy that you suggested, Nate, has been our informal 
policy.  The reason I posted my somewhat trivial changes to reviewboard this 
morning is to give Tushar a chance to comment on my changes before I pushed 
them.  Also, though one of my patches is a single line, it will require a new 
set of regression tester stats.  I felt that kind of change needed to be 
highlighted by posting a review.

Maybe the best policy is to make better use of the -U and -G options of 
postreview.  I know I'm guilty of not using those options before, but I really 
should have specified -U tushar for those patches.  Right now if one doesn't 
specify -U or -G, it is sent to the default group (which is all of m5-dev, 
correct?) and gabe, ali, steve, and nate.  Even when -U is specified, it only 
adds the additional user to the list and doesn't overwrite gabe, ali, steve, 
and nate.  Instead, maybe we still send patches to the default group, but 
remove the current list of four users.  That way we can better customize who 
the explicit targets are.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabriel Michael Black
 Sent: Wednesday, April 27, 2011 11:25 AM
 To: m5-dev@m5sim.org
 Subject: Re: [m5-dev] Code Reviewing
 
 That sounds reasonable. With too many reviews it gets harder to get to all of
 them, and some obscure things may languish with no reviews because only
 one person is comfortable with that code. Reviews are generally a really
 good thing but they have some overhead. If we don't get more benefit than
 that threshold, they aren't worth it in that case.
 
 Gabe
 
 Quoting nathan binkert n...@binkert.org:
 
  Hi Everyone,
 
  We don't have an official policy on code reviews, but I think we're
  being a bit pedantic with them.  While I definitely want us to err on
  the side of having a code review if the author has any doubt, I think it
  is completely unnecessary to have reviews on things like changing
  comments and text in strings.  Similarly, obvious bug fixes (though
  this is one of those subjective things that the author has to
  consider) need not be reviewed.
 
  What do you all think?  What is our policy?  Am I crazy? Should we
  review everything?
 
 Nate
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Trace not working

2011-04-25 Thread Beckmann, Brad
Maybe I'm doing something stupid here, but on a  clean checkout, the following 
short patch encounters the subsequent compiler error:

diff --git a/src/mem/SConscript b/src/mem/SConscript
--- a/src/mem/SConscript
+++ b/src/mem/SConscript
@@ -57,6 +57,7 @@
 TraceFlag('BusAddrRanges')
 TraceFlag('BusBridge')
 TraceFlag('LLSC')
+TraceFlag('FlagCheck')
 TraceFlag('MMU')
 TraceFlag('MemoryAccess')
 
diff --git a/src/mem/port.cc b/src/mem/port.cc
--- a/src/mem/port.cc
+++ b/src/mem/port.cc
@@ -106,6 +106,7 @@
 Port::setPeer(Port *port)
 {
     DPRINTF(Config, "setting peer to %s\n", port->name());
+    DPRINTF(FlagCheck, "check setting peer to %s\n", port->name());
 
     peer = port;
 }


Error:

scons: Building targets ...
 [ CXX] X86_SE_MOESI_hammer/mem/port.cc -> .do
build/X86_SE_MOESI_hammer/mem/port.cc: In member function 'virtual void Port::setPeer(Port*)':
build/X86_SE_MOESI_hammer/mem/port.cc:109: error: 'FlagCheck' is not a member of 'Debug'
 [SWIG] X86_SE_MOESI_hammer/python/m5/internal/vptype_IntLink.i -> _wrap.cc, .py
 [SWIG] X86_SE_MOESI_hammer/python/m5/internal/vptype_AddrRange.i -> _wrap.cc, .py
scons: *** [build/X86_SE_MOESI_hammer/mem/port.do] Error 1
 [SWIG] X86_SE_MOESI_hammer/python/m5/internal/vptype_Process.i -> _wrap.cc, .py
scons: building terminated because of errors.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of nathan binkert
 Sent: Monday, April 25, 2011 5:29 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Trace not working
 
  However, I am confused as well on how to add a new
 TraceFlag/DebugFlag.  It seems that all the previous flags are still
 specified using the TraceFlag() function, but I can't seem to be able
  to specify a new one.  Also, to be consistent, should we change the
  name of the TraceFlag function to DebugFlag?
 
 You should still use the TraceFlag function in SCons.  Are you sure
 this doesn't work?  And yes, I should probably rename TraceFlag to
 DebugFlag.
 
   Nate
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] what scons can do

2011-04-22 Thread Beckmann, Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of nathan binkert
 Sent: Thursday, April 21, 2011 5:53 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] what scons can do
 
  Maybe so... I think there's a subconscious impression that it takes a
  while because there's a phase in the build that takes a noticeable
  amount of time and that's all the output you see.  If in fact that
  delay is 10% running SLICC and 90% scons doing other stuff silently
  then I agree it's not such a big deal.
 I think it's more like 1%/99% :)

Is that 1%/99% a statement of a clean build for m5.fast?  I think the much more 
common case is you edit one .cc file and rebuild.  In that situation, it sure 
seems like a lot more than 1% of the time is spent by scons regenerating and 
reanalyzing SLICC files.

Whatever it may be, it sure would be great if we could speed things up.  I'm 
happy to help however I can.

Brad


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Defining Miss Latencies in Ruby

2011-04-20 Thread Beckmann, Brad
Hi Korey,

I'm confused.  The miss_latency calculated by the sequencer is the miss latency 
of the particular request, not just L1 cache hits.

If you're seeing a bunch of minimum latency requests, I suspect something else 
is wrong.  For instance, is issued_time a cycle value or a tick value?

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Wednesday, April 20, 2011 11:38 AM
 To: M5 Developer List
 Subject: [m5-dev] Defining Miss Latencies in Ruby
 
 Hi all,
 I've been working on miss latencies and stats in Ruby Caches and I noticed
 something that might be a bug in tracking miss stats.
 
 The code in Sequencer.cc has the following check for looking at a miss:
 Time miss_latency = g_eventQueue_ptr->getTime() - issued_time;
 
 // Profile the miss latency for all non-zero demand misses
 if (miss_latency != 0) {
   track miss stats
 }
 
 
 Should this not instead be L1_cache_latency or 2 * L1_cache_latency  (if
 it has to be buffered both ways)???
 
 The effect of this I think is a saturation of the miss latency histogram in 
 the
 1st bucket.
 
 If anyone has any thoughts, let me know, as I could be missing something
 here ... :)
 
 --
 - Korey
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Defining Miss Latencies in Ruby

2011-04-20 Thread Beckmann, Brad
Sure, it is recording all miss latencies, including L1 cache hits.  And yes, 
those L1 hits will show up in the first bucket.  However, I don't see how that 
is a bug.  If you don't want to include L1 hits in the histogram, then look how 
the MOESI_hammer protocol tracks separate miss latencies depending on the 
responding machine type.
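The per-responder accounting Brad points to can be sketched briefly. The machine names here are assumptions, not MOESI_hammer's actual enum: miss latencies are recorded keyed by the responding machine type, so L1 hits land in their own bucket instead of flooding the first bin of a combined histogram.

```python
# Illustrative sketch: record miss latency keyed by the responding machine
# type, so L1 hits are reported separately from misses served elsewhere.
from collections import defaultdict

latency_by_responder = defaultdict(list)

def record_miss(responder, latency):
    latency_by_responder[responder].append(latency)

record_miss("L1Cache", 1)      # an L1 hit lands in its own bucket
record_miss("Directory", 80)   # a miss served by memory
assert latency_by_responder["L1Cache"] == [1]
assert latency_by_responder["Directory"] == [80]
```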

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Korey Sewell
 Sent: Wednesday, April 20, 2011 7:20 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Defining Miss Latencies in Ruby
 
 (comments inline)
 
  I'm confused.  The miss_latency calculated by the sequencer is the
 miss
  latency of the particular request, not just L1 cache hits.
 
  I think I understand that, but even if it's just L1 hits, let's say that the
  L1 latency is 1 and you are running a workload with a high hit rate in the
  L1s. Then doesn't the code continuously record that L1 hit in the 1st
  histogram bucket? This would definitely be the case for L1 latencies of more
  than 1, since it's hardcoded to record everything with a latency > 0
  (basically all requests), right?
 
 
 
  If you're seeing a bunch of minimum latency requests, I suspect
 something
  else is wrong.
 
  For instance, is issued_time a cycle value or a tick value?
 
  The issued_time is in cycles; it is set in the Sequencer's makeRequest()
  function when a new Request is built.
 --
 - Korey
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Add support for functional accesses

2011-04-14 Thread Beckmann, Brad
Hi Nilay,

Let me start off by saying that I'm not sure if I fully understand the 
complexities of dealing with functional accesses from the PioPort and I could 
be overlooking a key concern.  

I think implementing functional accesses for the PioPort should be very similar 
to cpu functional accesses.  We still need to include the pio_port within the 
RubyPort and we need to send all requests not directed at physical memory to 
it.  You need to modify the connectX86RubySystem function in FSConfig.py so 
that all pio functional requests are seen by the ruby port versus physmem.  The 
more difficult piece maybe moving the address range registration functionality 
from physmem to RubyPort, since physmem will no longer exist.  If you run into 
difficulties there, I encourage you to send email to the dev list, since others 
will be better resources than me.
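The routing Brad describes can be sketched in a few lines. The range and names below are made up for illustration: a functional request whose address falls outside physical memory is forwarded to the pio port rather than satisfied from memory.

```python
# Rough sketch (ranges and names are assumptions): route a functional
# request based on whether its address falls inside physical memory.
PHYS_MEM_RANGE = (0x0, 0x8000000)  # assumed 128 MB of physical memory

def route_functional(addr):
    start, end = PHYS_MEM_RANGE
    return "physmem" if start <= addr < end else "pio_port"

assert route_functional(0x1000) == "physmem"
assert route_functional(0xFFFF0000) == "pio_port"
```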

Is that the kind of information you were looking for?

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Thursday, April 14, 2011 9:25 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Review Request: Ruby: Add support for functional
 accesses
 
 Brad, can you elaborate on implementing functional accesses for the
 PioPort?
 
 --
 Nilay
 
 On Wed, 13 Apr 2011, Beckmann, Brad wrote:
 
  I just reviewed it.  Please let me know if you have any questions.
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Nilay Vaish
  Sent: Tuesday, April 12, 2011 4:39 PM
  To: Default
  Subject: Re: [m5-dev] Review Request: Ruby: Add support for
  functional accesses
 
  Brad, can you take a look at the patch? I think we are now in
  position to implement functional accesses for the PioPort.
 
  --
  Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Add support for functional accesses

2011-04-13 Thread Beckmann, Brad
I just reviewed it.  Please let me know if you have any questions.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Tuesday, April 12, 2011 4:39 PM
 To: Default
 Subject: Re: [m5-dev] Review Request: Ruby: Add support for functional
 accesses
 
 Brad, can you take a look at the patch? I think we are now in position to
 implement functional accesses for the PioPort.
 
 --
 Nilay
 
 
 
 On Tue, 12 Apr 2011, Nilay Vaish wrote:
 
 
  ---
  This is an automatically generated e-mail. To reply, visit:
  http://reviews.m5sim.org/r/611/
  ---
 
  (Updated 2011-04-12 16:35:34.866577)
 
 
  Review request for Default.
 
 
  Summary (updated)
  ---
 
  Ruby: Add support for functional accesses This patch is meant for
  aiding discussions on implementation of functional access support in
  Ruby.
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] AccessPermission in AbstractEntry

2011-04-11 Thread Beckmann, Brad
Hi Nilay,

Yes, that is a good point.  We really just need the interface to the permission 
to be available from AbstractEntry.  The variable itself doesn't really need to 
be there.  However, to make that change, you'll need to modify how CacheMemory 
supports atomics.
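The interface-only idea can be sketched as follows. The class names are illustrative: the abstract entry exposes the permission accessor, but no stored m_Permission field exists; a concrete entry computes the permission from its coherence state on demand.

```python
# Sketch (class names illustrative): the base class declares the accessor;
# derived classes compute the permission from state, so no field can go stale.
from abc import ABC, abstractmethod

class AbstractEntry(ABC):
    @abstractmethod
    def get_permission(self):
        """No stored permission variable; derived classes compute it."""

class ProtocolEntry(AbstractEntry):
    def __init__(self):
        self.state = "I"

    def get_permission(self):
        return {"M": "Read_Write", "S": "Read_Only"}.get(self.state, "Invalid")

e = ProtocolEntry()
assert e.get_permission() == "Invalid"
e.state = "S"
assert e.get_permission() == "Read_Only"
```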

Could you elaborate on your directory controller question.  I suspect that you 
are right and that only one type of directory controller can exist in a system, 
but why is that a problem?

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Sunday, April 10, 2011 2:12 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] AccessPermission in AbstractEntry
 
 Brad, it seems like the m_Permission variable in AbstractEntry is not being
 used at all. In order to get AccessPermission for a state, the
 state_To_AccessPermission function needs to be called. Then, why have that
 variable? And this would mean that CacheMemory has no idea about the
 access permission, unless we expose the state to Cache Memory class.
 
 Also, as it now stands, it seems one cannot have two different types of
 directory controllers in a system. Is this correct? If yes, then why this
 restriction?
 
 --
 Nilay
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] AccessPermission in AbstractEntry

2011-04-11 Thread Beckmann, Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Monday, April 11, 2011 2:38 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] AccessPermission in AbstractEntry

 
  Could you elaborate on your directory controller question.  I suspect
  that you are right and that only one type of directory controller can
  exist in a system, but why is that a problem?
 
 
 Is it not possible that we have a protocol in which different directory
 controllers may behave differently?

Ok, I just had a chance to look at the code and I think you are referring to the 
lack of a directory MACHINETYPE macro in RubySlicc_ComponentMapping.hh.  Is 
that correct?  Ideally, there shouldn't be a problem with adding any 
arbitrarily named controller to Ruby, as long as you incorporate the right 
component mapping functions into the protocol.  However, in practice I suspect 
it will take some non-trivial amount of modifications to  
RubySlicc_ComponentMapping.hh.  Also you'll need to be careful how Ruby and 
generate SLICC code uses the auto generated MachineType functions.  There may 
be some tricky issues there.

Overall, I can't really provide you a lot of specifics on why the directory 
MACHINETYPE macro does not exist.  There may have been some assumptions behind 
that that were relevant in GEMS but are no longer valid in gem5.  I would grep 
through the Ruby and generated code for the MachineType functions to fully 
understand the ramifications.

Brad


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Interpreting and Fixing Ruby Stats??

2011-04-07 Thread Beckmann, Brad

  I realize the documentation is still under way for gem5, but I was
  wondering if there are any plans to document how users should  be
  interpreting the Ruby stats file? (Particularly, the miss latency
  histograms)

Not all protocols support the miss latency histograms.  Specifically, I believe 
only the hammer protocol supports all of them.  Do you have any particular 
questions about them?

 
  Did people come to the conclusion that it is a good idea to have a
  separate files for ruby stats v. m5 stats (if so, sorry for the extra 
  question)?
 

I'm not sure we have a final decision on that, but Derek is the one currently 
looking into it.

  Additionally, is there an update on any plans to add descriptions and
  do stats accounting for the various cache memories? To my surprise, I
  always get this output for any CacheMemory in Ruby:
  Cache Stats: system.l1_cntrl0.L1IcacheMemory
    system.l1_cntrl0.L1IcacheMemory_total_misses: 0
    system.l1_cntrl0.L1IcacheMemory_total_demand_misses: 0
    system.l1_cntrl0.L1IcacheMemory_total_prefetches: 0
    system.l1_cntrl0.L1IcacheMemory_total_sw_prefetches: 0
    system.l1_cntrl0.L1IcacheMemory_total_hw_prefetches: 0
 
  For now it looks like I'll have to derive some pseudo-information from
  the Cache event and transition counts OR go in and try to hack in some
  of these stats myself. But ideally, I would say one aggregated stat
  file where I could grep out about cpu and detailed memory stats (i.e.
  what about mshr miss/hit counts?) would be awesome.
 
  If all this stuff is there, please excuse my ignorance, but if not,
  would someone be so kind to provide a brief update of what's going on
 with this?

The stats you mentioned are supported by some protocols but not others.  
Basically those protocols that do support them, include special actions that 
increment these stats.  In my opinion, it is really hard to make these stats 
protocol agnostic, but you're welcome to propose a methodology that does.

Brad

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Running Ruby w/32 Cores

2011-04-06 Thread Beckmann, Brad
Hi Korey,

Yes, let's move this conversation back to m5-dev, since I think others may be 
interested and could help.

I don't know what the problem is exactly, but at some point of time (probably 
back in the early GEMS days) I seem to remember the Set code included an 
assertion check about the 31st bit in 32-bit mode.  Therefore, I think we knew 
about this problem and made sure that never happened.  I believe that is why we 
used to have a restriction that Ruby could only support 16 processors.  I'm 
really fuzzy on the details...maybe someone else can elaborate.

In the end, I just want to make sure we add something in the code that makes 
sure we don't encounter this problem again.  This is one of those bugs that can 
take a while to track down, if you don't catch it right when it happens with an 
assertion.

Brad



From: koreylsew...@gmail.com [mailto:koreylsew...@gmail.com] On Behalf Of Korey 
Sewell
Sent: Tuesday, April 05, 2011 7:14 AM
To: Beckmann, Brad
Subject: Re: [m5-dev] Running Ruby w/32 Cores

Hi again Brad,
I looked this over again, and although my 32-bit patch fixes things, I'm now 
not convinced that I actually fixed the cause of the bug rather than just the 
symptom.

Do you happen to know what are the problems with the 32-bit Set counts?

Sorry for prolonging the issue; I thought I had put this to bed, but maybe 
not. Finally, it may not matter that this works on 32-bit machines, but it'd be 
nice if it did. (Let me know if I should move this convo to the m5-dev list.)

I end up checking the last bit in the count function manually (the code is as 
follows):

int
Set::count() const
{
    int counter = 0;
    long mask;

    for (int i = 0; i < m_nArrayLen; i++) {
        mask = (long)0x01;

        for (int j = 0; j < LONG_BITS; j++) {
            // FIXME - significant performance loss when array
            // population < LONG_BITS
            if ((m_p_nArray[i] & mask) != 0) {
                counter++;
            }
            mask = mask << 1;
        }

#ifndef _LP64
        // manually check the most significant (31st) bit on 32-bit builds
        long msb_mask = 0x80000000;
        if ((m_p_nArray[i] & msb_mask) != 0) {
            counter++;
        }
#endif
    }

    return counter;
}
On Tue, Apr 5, 2011 at 1:30 AM, Korey Sewell ksew...@umich.edu wrote:
Brad, it  looks like you were right on the money here. I found the spot where 
it was returning the wrong value via a SLICC function to count sharers for 
everyone except the owner.

I realized that the machine that I use for testing is just a 32-bit machine, 
and like you warned there look to be issues with the Set type there. I ran the 
Fft-32 cores on a 64-bit machine and it seems to work correctly. I'll be 
running on the full splash/parsec suites soon and that should stress Ruby a 
good bit :).

I have a patch that checks to see if _LP64 is defined, and if not check that 
last bit when doing the set count function.

Thanks for being helpful in debugging. It was a relatively easy bug, but as 
always going through code and becoming more proficient at getting around while 
trying to solve a bug is really helpful.

On Fri, Apr 1, 2011 at 7:28 PM, Beckmann, Brad brad.beckm...@amd.com wrote:
Ok for the first trace, the critical line is the following:

348523   0L2Cache L1_GETX  ILOSXIFLXO  [0x16180, line 0x16180] 
[NetDest (4) 0  - 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 
1  - 0 0  - 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0  - 
]30

L2Cache identifies that 31 caches have a shared copy and that L1 cache 9 (L1-9) 
is the owner.
When L1Cache 0 (L1-0) issues a GETX, the L2Cache issues 30 Inv probes, forwards 
the GETX to L1-9, and sends an ack to L1-0 itself.
However, the L2 cache tells L1-0 to expect only 30 acks instead of 31.  It 
could be something wrong with the NetDest::count() function or the 
Set::count() function.  I slightly modified my previous patch to isolate 
what value the NetDest::count() function is returning.  If it is returning 30, 
instead of 31, then it must be a problem with NetDest.  You are compiling gem5 
as a 64-bit binary, right?

The second problem is essentially the same issue.  L2Cache 31 (L2-31) is the 
owner of the block, but I suspect NetDest is not counting bit 31 and thus it is 
returning a count of 0...causing the error.

Overall, concentrate on that NetDest::count function, or more importantly the 
Set::count() function.  Once you find out the problem, please let me know.

Thanks,

Brad


From: koreylsew...@gmail.com [mailto:koreylsew...@gmail.com] On Behalf Of 
Korey Sewell
Sent: Friday, April 01, 2011 12:00 PM
To: Beckmann, Brad

Subject: Re: [m5-dev] Running Ruby w/32 Cores

Brad,
attached are the protocol traces grep'd for the offending addresses. I'm going 
to spend the weekend digging through Ruby code so hopefully I'm pretty close to 
generating the fixes myself

Re: [m5-dev] ruby_mem_tester.py

2011-04-01 Thread Beckmann, Brad
Thanks for pointing this out.  The hammer protocol included an optimization for 
uniprocessor DMA that was probably just too aggressive to be worth the 
complexity.  The optimization broke when I fixed another DMA bug in the 
protocol last week, but I failed to realize that since I often don't think 
about uniprocessor scenarios.  Rather than try to revive the optimization, I'm 
just going to remove it.

Patch is forthcoming.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Lisa Hsu
 Sent: Thursday, March 31, 2011 5:39 PM
 To: M5 Developer List
 Subject: [m5-dev] ruby_mem_tester.py
 
 Hi all,
 
 As I prepared to push a bunch of stuff today I found that the following
 command line fails at the head of the clean tree:
 
 ALPHA_SE_MOESI_hammer/m5.debug configs/example/ruby_mem_test.py
 -l 1000 --num-dma 2
 
 I pushed my changes anyway because they didn't make any difference on
 this error, but I've never run ruby_mem_test before, haven't worked with
 DMA sequencers, and am not particularly cozy with MOESI_hammer, and
 was wondering if this was known or unknown, expected or unexpected.  I
 presume unknown and unexpected.
 
 The error is an invalid transition from MI with event Writeback_Nack.  It
 seems to occur anytime --num-dma > 1.  Is this a big concern?  Should I add
 this to flyspray?
 
 Lisa
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Add support for functional accesses

2011-03-30 Thread Beckmann, Brad
Hi Nilay,

Thanks for posting a new patch.  I will review it as soon as I can...hopefully 
tonight.

Brad

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Wednesday, March 30, 2011 4:32 PM
 To: Default
 Subject: Re: [m5-dev] Review Request: Ruby: Add support for functional
 accesses
 
 On Tue, 29 Mar 2011, Nilay Vaish wrote:
 
  Brad, I have posted on the review board my current implementation for
  supporting functional accesses in Ruby. This is untested and is mainly
  meant for furthering the discussions. I have some questions for you --
 
  1. How do we inform the other end of RubyPort's M5 Port about whether
  or not functional access was successful?
 
  2. What's the role of directory memory in functional accesses?
 
  3. If none of the caches have the block pertaining to the address of
  the access, then read accesses should be satisfied from the physical
 memory.
  Write accesses should always go to physical memory as well. How can
  physical memory be accessed from RubyPort?
 
  --
  Nilay
 
 
 
 Brad, I have made some changes to the patch. I have updated it on the
 review board. I have added a call to sendFunctional() so as to send the
 response. I have also added call to sendFunctional() on the physical memory
 port of ruby port, so that the physical memory would also get updated.
 
 You had mentioned that we would unhook M5 memory and use Ruby to
 supply the data. How do we do this? And the second question from the
 previous mail still remains unanswered.
 
 Thanks
 Nilay
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Running Ruby w/32 Cores

2011-03-30 Thread Beckmann, Brad
Hi Korey,

For the first trace, it looks like the L2 cache is either miscounting the 
number of valid L1 copies, or there is an error with the ack arithmetic.  We 
are going to need a bit more information to figure out where the exact problem 
is.  Could you apply the attached patch and reply with the new protocol trace?  
Thanks.

For the second trace, you should be able to get the offending address by simply 
attaching GDB to the aborted process.  Without knowing which address to zero in 
on, it is the proverbial needle in a haystack.

Thanks,

Brad



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Tuesday, March 29, 2011 10:15 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Running Ruby w/32 Cores
 
 Thanks for the response Brad.
 
 The 1st trace has 1 L2 and the 2nd has 1 L2 (I had a typo in the original 
 email).
 
 For each trace, I attach the stdout/stderr (*.out) and then the protocol trace
 (*.prottrace).
 
 Also, in the 1st trace, the offending address is clear and I isolate that in 
 the
 protocol trace file provided. However, in the 2nd trace, it's unclear 
 (currently)
 which access caused it to fail so I took the whole protocol trace file and 
 gzip'd
 it.
 
 My current lack of expertise in SLICC limits me a bit, but I'd like to be more
 helpful in debugging so if there is anything that I can look into (or run) on 
 my
 end to expedite the process, please advise. In the interim, I'll try to locate
 the exact address that's breaking trace 2 and then hopefully repost that.
 
 Thanks!
 
 -Korey
 
 On Tue, Mar 29, 2011 at 12:02 PM, Beckmann, Brad
 brad.beckm...@amd.com wrote:
  Hi Korey,
 
  I believe both of these issues should be easy to solve once we have a
 protocol trace leading up to the error.  If you could create such a trace and
 send it to the list, that would be great.  Just zero in on the offending 
 address.
 
  Thanks,
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Korey Sewell
  Sent: Tuesday, March 29, 2011 8:11 AM
  To: M5 Developer List
  Subject: [m5-dev] Running Ruby w/32 Cores
 
  Hi All,
  I'm still having a bit of trouble running Ruby with 32+ cores. I am
  experimenting w/configs varying the l2-caches. The runs seem to
  generate various errors in the SLICC.
 
  Has anybody seen these or have any insight to how to start solving
  these type of issues (posted below)?
  =
  The command line and errors are as follows:
  (1) 32 Cores and 32 L2s
  build/ALPHA_FS_MOESI_CMP_directory/m5.opt
  configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-
  l2caches=32 ...
  info: Entering event queue @ 0.  Starting simulation...
  Runtime Error at MOESI_CMP_directory-dir.sm:155, Ruby Time: 38279:
  assert failure, PID: 5990
  press return to continue.
 
  Program aborted at cycle 19139500
  Aborted
 
  (2) 32 Cores and 1 L2
  build/ALPHA_FS_MOESI_CMP_directory/m5.opt
  configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-
  l2caches=32 ...
  fatal: Invalid transition
  system.l1_cntrl0 time: 349075 addr: [0x16180, line 0x16180] event: Ack
 state:
  MM  @ cycle 174537500
 
 [doTransitionWorker:build/ALPHA_FS_MOESI_CMP_directory/mem/protoc
  ol/L1Cache_Transitions.cc,
  line 477]
  Memory Usage: 2316756 KBytes
  For more information see: http://www.m5sim.org/fatal/23f196b2
  
 
  Please let me know if you do...Thanks!
 
  --
  - Korey
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
 
 
 --
 - Korey
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Running Ruby w/32 Cores

2011-03-29 Thread Beckmann, Brad
Hi Korey,

I believe both of these issues should be easy to solve once we have a protocol 
trace leading up to the error.  If you could create such a trace and send it to 
the list, that would be great.  Just zero in on the offending address.

Thanks,

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Tuesday, March 29, 2011 8:11 AM
 To: M5 Developer List
 Subject: [m5-dev] Running Ruby w/32 Cores
 
 Hi All,
 I'm still having a bit of trouble running Ruby with 32+ cores. I am
 experimenting w/configs varying the l2-caches. The runs seem to generate
 various errors in the SLICC.
 
 Has anybody seen these or have any insight to how to start solving these
 type of issues (posted below)?
 =
 The command line and errors are as follows:
 (1) 32 Cores and 32 L2s
 build/ALPHA_FS_MOESI_CMP_directory/m5.opt
 configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-
 l2caches=32 ...
 info: Entering event queue @ 0.  Starting simulation...
 Runtime Error at MOESI_CMP_directory-dir.sm:155, Ruby Time: 38279:
 assert failure, PID: 5990
 press return to continue.
 
 Program aborted at cycle 19139500
 Aborted
 
 (2) 32 Cores and 1 L2
 build/ALPHA_FS_MOESI_CMP_directory/m5.opt
 configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-
 l2caches=32 ...
 fatal: Invalid transition
 system.l1_cntrl0 time: 349075 addr: [0x16180, line 0x16180] event: Ack state:
 MM  @ cycle 174537500
 [doTransitionWorker:build/ALPHA_FS_MOESI_CMP_directory/mem/protoc
 ol/L1Cache_Transitions.cc,
 line 477]
 Memory Usage: 2316756 KBytes
 For more information see: http://www.m5sim.org/fatal/23f196b2
 
 
 Please let me know if you do...Thanks!
 
 --
 - Korey
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] changeset in m5: ruby: fixed cache index setting

2011-03-26 Thread Beckmann, Brad
Thanks!

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Steve Reinhardt
 Sent: Saturday, March 26, 2011 1:54 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] changeset in m5: ruby: fixed cache index setting
 
 I can do it... just wanted to make sure it was expected and not an
 actual bug.
 
 On Sat, Mar 26, 2011 at 1:46 PM, Beckmann, Brad brad.beckm...@amd.com
 wrote:
  Hi Steve,
 
  Oops.  It was such a small change in configuration, I didn't think
 about rerunning the regression tester, but now thinking about it, yes
 it could impact the results.  The cache indexing functions were not
 using the right bits before this change.
 
  I can go ahead and update the stats tonight.  However, let me know if
 it is more convenient for you to update them yourself.
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
  Behalf Of Steve Reinhardt
  Sent: Saturday, March 26, 2011 6:20 AM
  To: M5 Developer List
  Subject: Re: [m5-dev] changeset in m5: ruby: fixed cache index
 setting
 
  Hi Brad,
 
  Would you expect this to change the results for the ruby regressions
  slightly?  The regressions passed last night because the tests
 didn't
  actually get rerun (since scons doesn't see the config file as a
  dependency), but I'm seeing some failures in the tip on tests I'm
  running and I suspect it's due to this change.
 
  Steve
 
  On Fri, Mar 25, 2011 at 10:15 AM, Brad Beckmann
 brad.beckm...@amd.com
  wrote:
   changeset d8587c913ccf in /z/repo/m5
   details: http://repo.m5sim.org/m5?cmd=changeset;node=d8587c913ccf
   description:
          ruby: fixed cache index setting
  
   diffstat:
  
    configs/ruby/MESI_CMP_directory.py  |  17 +++--
    configs/ruby/MI_example.py          |   4 +++-
    configs/ruby/MOESI_CMP_directory.py |  17 +++--
    configs/ruby/MOESI_CMP_token.py     |  15 +--
    configs/ruby/MOESI_hammer.py        |  10 +++---
    5 files changed, 41 insertions(+), 22 deletions(-)
  
   diffs (207 lines):
  
   diff -r bbab80b639cb -r d8587c913ccf
  configs/ruby/MESI_CMP_directory.py
   --- a/configs/ruby/MESI_CMP_directory.py        Fri Mar 25
 00:46:14
  2011 -0400
   +++ b/configs/ruby/MESI_CMP_directory.py        Fri Mar 25
 10:13:50
  2011 -0700
   @@ -68,15 +68,19 @@
       # Must create the individual controllers before the network to
  ensure the
       # controller constructors are called before the network
  constructor
       #
   +    l2_bits = int(math.log(options.num_l2caches, 2))
   +    block_size_bits = int(math.log(options.cacheline_size, 2))
  
       for i in xrange(options.num_cpus):
           #
           # First create the Ruby objects associated with this cpu
           #
           l1i_cache = L1Cache(size = options.l1i_size,
   -                            assoc = options.l1i_assoc)
   +                            assoc = options.l1i_assoc,
   +                            start_index_bit = block_size_bits)
           l1d_cache = L1Cache(size = options.l1d_size,
   -                            assoc = options.l1d_assoc)
   +                            assoc = options.l1d_assoc,
   +                            start_index_bit = block_size_bits)
  
           cpu_seq = RubySequencer(version = i,
                                   icache = l1i_cache,
   @@ -91,9 +95,7 @@
                                         sequencer = cpu_seq,
                                         L1IcacheMemory = l1i_cache,
                                         L1DcacheMemory = l1d_cache,
   -                                      l2_select_num_bits = \
   -
   math.log(options.num_l2caches,
   -                                                 2))
   +                                      l2_select_num_bits =
 l2_bits)
  
           exec("system.l1_cntrl%d = l1_cntrl" % i)
  
   @@ -103,12 +105,15 @@
           cpu_sequencers.append(cpu_seq)
           l1_cntrl_nodes.append(l1_cntrl)
  
   +    l2_index_start = block_size_bits + l2_bits
   +
       for i in xrange(options.num_l2caches):
           #
           # First create the Ruby objects associated with this cpu
           #
           l2_cache = L2Cache(size = options.l2_size,
   -                           assoc = options.l2_assoc)
   +                           assoc = options.l2_assoc,
   +                           start_index_bit = l2_index_start)
  
           l2_cntrl = L2Cache_Controller(version = i,
                                         L2cacheMemory = l2_cache)
   diff -r bbab80b639cb -r d8587c913ccf configs/ruby/MI_example.py
   --- a/configs/ruby/MI_example.py        Fri Mar 25 00:46:14 2011 -
  0400
   +++ b/configs/ruby/MI_example.py        Fri Mar 25 10:13:50 2011 -
  0700
   @@ -60,6 +60,7 @@
       # Must create the individual controllers before the network to
  ensure the
       # controller constructors are called before the network
  constructor
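The start_index_bit arithmetic in the diff above is worth a small illustration. The following is a hedged sketch (the helper names are hypothetical, not gem5 config code) that uses an exact integer log2 in place of the config's int(math.log(n, 2)), avoiding any floating-point truncation:

```python
def log2_int(n):
    """Exact integer log2 for a power-of-two n (the config files
    compute int(math.log(n, 2)) for the same purpose)."""
    assert n > 0 and n & (n - 1) == 0, "expected a power of two"
    return n.bit_length() - 1

def index_start_bits(num_l2caches, cacheline_size):
    """Where each cache's set index begins: L1s index just above
    the block offset; L2 banks also skip the bank-select bits."""
    l2_bits = log2_int(num_l2caches)            # bank-select bits
    block_size_bits = log2_int(cacheline_size)  # block-offset bits
    l1_index_start = block_size_bits
    l2_index_start = block_size_bits + l2_bits
    return l1_index_start, l2_index_start

# 4 L2 banks with 64-byte lines: L1 indexes from bit 6, L2 from bit 8.
print(index_start_bits(4, 64))   # (6, 8)
```

Before the fix described in this thread, the indexing functions effectively started at bit 0 and so hashed on the block-offset and bank-select bits.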

Re: [m5-dev] Debugging Ruby Deadlocks...

2011-03-23 Thread Beckmann, Brad
Hi Korey,

A few comments:

- The difference in time is because the sequencer prints out the RubyCycle 
count for the issue time, while the tick count is the global curTick value.  Now 
that Ruby uses DPRINTFs, I think it makes sense to move all those Ruby 
printouts to ticks.  I actually have that on my long todo list, but I'm sure I 
won't get to it soon.  If you want to go ahead and make the conversion, please 
do.

- I'm pretty sure that Invalid range error is completely unrelated.  Instead, 
that sort of error occurs when you try to print out a MachineType variable 
whose value has not been set.  Typically that happens with network messages 
where the enqueue operations don't fill in all the fields.

- Overall, when tracking down a deadlock issue, start with the protocol trace 
and track down what is happening with the particular address in question.  From 
there, you can typically get an idea of what to zero in on.

- By the way, have you had a chance to confirm that my patches from this 
weekend fixed your previous dma problem?

Brad
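As a side note on the RubyCycle/tick mismatch mentioned above: in the error messages elsewhere in this archive, Ruby Time 38279 in an assert failure corresponds to the abort at cycle 19139500, which implies a 500-tick Ruby clock period. A hedged sketch of the conversion (the constant is inferred from those numbers, not taken from the source):

```python
RUBY_CLOCK_PERIOD = 500  # ticks per Ruby cycle, inferred from the traces

def ruby_cycles_to_ticks(cycles, period=RUBY_CLOCK_PERIOD):
    """Convert a RubyCycle timestamp to the global curTick value."""
    return cycles * period

# Ruby Time 38279 from the assert failure maps to the abort tick:
print(ruby_cycles_to_ticks(38279))   # 19139500
```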



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Wednesday, March 23, 2011 12:43 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Debugging Ruby Deadlocks...
 
 This problem may be related to the post I sent earlier today about
 Debugging Ruby Deadlocks, but when adding the RubyQueue traceflag
 for 32 cores you get this error:
 build/ALPHA_FS_MOESI_CMP_directory/m5.opt -d ruby_opt/ --trace-
 flags=RubyQueue configs/example/ruby_fs.py -b fft_64t_base -n 1 ...
 panic: Invalid range for type MachineType  @ cycle 793500
 [MachineType_to_string:build/ALPHA_FS_MOESI_CMP_directory/mem/pro
 tocol/MachineType.cc,
 line 42]
 Memory Usage: 2312860 KBytes
 For more information see: http://www.m5sim.org/panic/f419fb7
 Program aborted at cycle 793500
 Aborted
 
 
 I thought this was solved a while ago by a previous patch, but it seems to be
 an issue again. Is it something in the SLICC that isn't being defined properly
 when the core count is large? If anybody has any thoughts, please let me
 know so we can patch it and push the changeset in the tree.
 
 Note: I don't get this problem when running for just 1 core.
 --
 - Korey
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick

2011-03-22 Thread Beckmann, Brad
I sent Tushar an email this morning regarding this, hoping to catch him before 
he went to bed (he's currently in Singapore).  Unfortunately he hasn't 
responded.

Hopefully he'll get to this when he wakes up in a few hours.  If he doesn't, 
I'll take a look at it tomorrow morning.  I don't have time to do it today.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Tuesday, March 22, 2011 1:04 PM
 To: m5-dev@m5sim.org
 Subject: Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-
 regression quick
 
 You may already be taking care of this, but networktest.cc also had an error
 (ambiguous use of the pow function) that made all the regressions fail. That
 needs to be fixed quickly, regardless of what happens with the warnings or
 who originally worked on the code. Also, code that doesn't compile should
 really never have been committed in the first place. It couldn't have been
 tested since it couldn't have been run.
 
 Gabe
 
 On 03/22/11 15:46, Nilay Vaish wrote:
  On Tue, 22 Mar 2011, nathan binkert wrote:
 
  The warnings related to networktest.cc got added yesterday.
 
  Tushar should take care of the warnings related to networktest.cc.
 
 
  
 
  These I think have been around for quite a while.
 
  Either way, we should be eliminating warnings.
 
 
  I will commit a patch to eliminate the Sequencer related warnings.
 
  --
  Nilay
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Convert AccessModeType to RubyAccessMode

2011-03-22 Thread Beckmann, Brad
Hi Nilay,

Why do you want to change the name?  Both names seem equivalent to me.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
 Sent: Friday, March 18, 2011 9:55 PM
 To: Nilay Vaish; Default
 Subject: [m5-dev] Review Request: Ruby: Convert AccessModeType to
 RubyAccessMode
 
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/601/
 ---
 
 Review request for Default.
 
 
 Summary
 ---
 
 Ruby: Convert AccessModeType to RubyAccessMode
 This patch converts AccessModeType to RubyAccessMode so that both the
 protocol dependent and independent code uses the same access mode.
 
 
 Diffs
 -
 
   src/cpu/testers/rubytest/Check.hh 9a6a02a235f1
   src/cpu/testers/rubytest/Check.cc 9a6a02a235f1
   src/mem/protocol/MESI_CMP_directory-msg.sm 9a6a02a235f1
   src/mem/protocol/MOESI_CMP_directory-msg.sm 9a6a02a235f1
   src/mem/protocol/MOESI_CMP_token-L1cache.sm 9a6a02a235f1
   src/mem/protocol/MOESI_CMP_token-dir.sm 9a6a02a235f1
   src/mem/protocol/MOESI_CMP_token-msg.sm 9a6a02a235f1
   src/mem/protocol/RubySlicc_Exports.sm 9a6a02a235f1
   src/mem/protocol/RubySlicc_Types.sm 9a6a02a235f1
   src/mem/ruby/profiler/AccessTraceForAddress.hh 9a6a02a235f1
   src/mem/ruby/profiler/AccessTraceForAddress.cc 9a6a02a235f1
   src/mem/ruby/profiler/AddressProfiler.hh 9a6a02a235f1
   src/mem/ruby/profiler/AddressProfiler.cc 9a6a02a235f1
   src/mem/ruby/profiler/CacheProfiler.hh 9a6a02a235f1
   src/mem/ruby/profiler/CacheProfiler.cc 9a6a02a235f1
   src/mem/ruby/profiler/Profiler.hh 9a6a02a235f1
   src/mem/ruby/slicc_interface/RubyRequest.hh 9a6a02a235f1
   src/mem/ruby/system/CacheMemory.hh 9a6a02a235f1
   src/mem/ruby/system/CacheMemory.cc 9a6a02a235f1
   src/mem/ruby/system/Sequencer.hh 9a6a02a235f1
   src/mem/ruby/system/Sequencer.cc 9a6a02a235f1
 
 Diff: http://reviews.m5sim.org/r/601/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Convert AccessModeType to RubyAccessMode

2011-03-22 Thread Beckmann, Brad
Nevermind, I understand the reason now.  This looks good to me.

Thanks,

Brad

 -Original Message-
 From: Beckmann, Brad
 Sent: Saturday, March 19, 2011 1:50 PM
 To: 'M5 Developer List'
 Subject: RE: [m5-dev] Review Request: Ruby: Convert AccessModeType to
 RubyAccessMode
 
 Hi Nilay,
 
 Why do you want to change the name?  Both names seem equivalent to me.
 
 Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
  Behalf Of Nilay Vaish
  Sent: Friday, March 18, 2011 9:55 PM
  To: Nilay Vaish; Default
  Subject: [m5-dev] Review Request: Ruby: Convert AccessModeType to
  RubyAccessMode
 
 
  ---
  This is an automatically generated e-mail. To reply, visit:
  http://reviews.m5sim.org/r/601/
  ---
 
  Review request for Default.
 
 
  Summary
  ---
 
  Ruby: Convert AccessModeType to RubyAccessMode
  This patch converts AccessModeType to RubyAccessMode so that both the
  protocol dependent and independent code uses the same access mode.
 
 
  Diffs
  -
 
src/cpu/testers/rubytest/Check.hh 9a6a02a235f1
src/cpu/testers/rubytest/Check.cc 9a6a02a235f1
src/mem/protocol/MESI_CMP_directory-msg.sm 9a6a02a235f1
src/mem/protocol/MOESI_CMP_directory-msg.sm 9a6a02a235f1
src/mem/protocol/MOESI_CMP_token-L1cache.sm 9a6a02a235f1
src/mem/protocol/MOESI_CMP_token-dir.sm 9a6a02a235f1
src/mem/protocol/MOESI_CMP_token-msg.sm 9a6a02a235f1
src/mem/protocol/RubySlicc_Exports.sm 9a6a02a235f1
src/mem/protocol/RubySlicc_Types.sm 9a6a02a235f1
src/mem/ruby/profiler/AccessTraceForAddress.hh 9a6a02a235f1
src/mem/ruby/profiler/AccessTraceForAddress.cc 9a6a02a235f1
src/mem/ruby/profiler/AddressProfiler.hh 9a6a02a235f1
src/mem/ruby/profiler/AddressProfiler.cc 9a6a02a235f1
src/mem/ruby/profiler/CacheProfiler.hh 9a6a02a235f1
src/mem/ruby/profiler/CacheProfiler.cc 9a6a02a235f1
src/mem/ruby/profiler/Profiler.hh 9a6a02a235f1
src/mem/ruby/slicc_interface/RubyRequest.hh 9a6a02a235f1
src/mem/ruby/system/CacheMemory.hh 9a6a02a235f1
src/mem/ruby/system/CacheMemory.cc 9a6a02a235f1
src/mem/ruby/system/Sequencer.hh 9a6a02a235f1
src/mem/ruby/system/Sequencer.cc 9a6a02a235f1
 
  Diff: http://reviews.m5sim.org/r/601/diff
 
 
  Testing
  ---
 
 
  Thanks,
 
  Nilay
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-22 Thread Beckmann, Brad
Korey, if your deadlock is with running the MOESI_CMP_directory protocol, I'm 
not surprised.  DMA support is pretty much broken in that protocol.  I have 
that fixed, and I also fixed the underlying DMA problem.  I'll be pushing the 
fixes momentarily.

Korey and Malek, please pull these changes and confirm they fix your problem.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Friday, March 18, 2011 9:12 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
 message below
 
 Why did it work before the block size patch?
  - When the ChunkGenerator sees the block size is 0, it doesn't split
  up the request into multiple packets and sends the whole dma request
  at once.  That is fine because the DMASequencer splits the request
  into multiple requests and only responds to the dma port when the entire
 request is complete.
 
 With regards to the old changeset that boots with the block size = 0, I was 
 not
 able to boot a large scale CMP system (more than 16 cores) due to the
 deadlock threshold being triggered.
 
 I'm assuming that Brad has a read on how to fix that problem so I'll probably
 start working on what is causing that deadlock so hopefully we can kind of
 pipeline the bug fixes.
 
 --
 - Korey
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-22 Thread Beckmann, Brad
Nevermind those.  I had several incoming and outgoing emails from the weekend 
that finally got through our system.

Brad

  

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay
 Sent: Tuesday, March 22, 2011 8:07 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
 On Sat, March 19, 2011 6:01 pm, Beckmann, Brad wrote:
  Korey, if you're deadlock is with running the MOESI_CMP_directory
  protocol, I'm not surprised.  DMA support is pretty much broken in
 that
  protocol.  I have that fixed and I also fixed the underlining DMA
 problem.
   I'll be pushing the fixes momentarily.
 
  Korey and Malek, please pull these changes and confirm they fix your
  problem.
 
  Brad
 
 
 
 Brad, how come the mails you sent on Saturday are being received now?
 
 
 --
 Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-17 Thread Beckmann, Brad
Hi Malek/Korey,

The good news is that I've been able to dedicate a significant amount of time 
to this over the past day or so, and I've got a good handle on what is going on 
here.

Why did it work before the block size patch?
- When the ChunkGenerator sees the block size is 0, it doesn't split up the 
request into multiple packets and sends the whole dma request at once.  That is 
fine because the DMASequencer splits the request into multiple requests and 
only responds to the dma port when the entire request is complete.

What is the current problem?
- When the ChunkGenerator sees a block size of 64, the dma port splits the 
request into 64-byte packets, effectively doing the same thing the DMASequencer 
does.  That in itself shouldn't break things.  The DMASequencer nacks all but 
the first 64-byte request of the dma transfer because it is designed to only 
handle one M5 packet at a time.  Eventually the first 64-byte packet completes 
and the RubyPort tells the dma port to retry the second packet.  The dma port 
does, but for some reason the DMASequencer still nacks that second request.  
I'm not quite sure why that is, but I'm sure I'll figure it out soon.  Once I 
do, I'll push a fix along with all the other fixes I've come across during 
this multi-day adventure.

Brad
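The splitting behavior described above can be sketched as follows. This is a hedged illustration of ChunkGenerator-style chunking (the function name and the "chunk size 0 means no split" convention come from the behavior described in this thread, not from the gem5 source): each piece ends at the next chunk-size-aligned boundary, so a transfer that does not start on a block boundary gets a short first piece.

```python
def chunks(addr, size, chunk_size):
    """Yield (addr, size) pieces of a request.  A chunk_size of 0
    passes the request through whole; otherwise each piece stops
    at the next chunk_size-aligned address."""
    if chunk_size == 0:
        yield (addr, size)
        return
    end = addr + size
    while addr < end:
        # first address past addr that sits on a chunk boundary
        next_boundary = (addr // chunk_size + 1) * chunk_size
        piece_end = min(next_boundary, end)
        yield (addr, piece_end - addr)
        addr = piece_end

# A 200-byte DMA starting at 0x10 with 64-byte blocks splits into a
# short head piece, two full blocks, and a tail piece:
print(list(chunks(0x10, 200, 64)))   # [(16, 48), (64, 64), (128, 64), (192, 24)]
```

With a block size of 0, the single yielded piece is what lets the DMASequencer see the whole transfer at once, matching the pre-patch behavior described above.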

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Thursday, March 17, 2011 3:10 PM
 To: Malek Musleh
 Cc: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
 Hi Malek,
 Can you send your most recent trace showing what you described (if it isn't
 too big)? I haven't observed the different request size errors, but I think I
 have observed the different PRD addresses on the first access (in the most
 recent changeset). I'll double check.
 
 I was planning to post sometime soon what was the latest on my debugging
 efforts, but a quick summary is that the PRD address gets set from a
 BMI.DTP register that eventually gets propagated through. I haven't been
 able to verify if that is loaded from the kernel or some configuration
 parameter quite yet.
 
 
 I have a feeling it might be also linked with the timing simpleCpu
  changes about handling split requests, although Alpha does not support
  split requests, that is independent of the DMA transfers.
 
 Are you sure it's a split request problem and not an uncacheable address
 thing? Or maybe it's some combo of both?
 
 
 
  Also, comparing Ruby Traces (with and without failing changeset) the
  first PRD BaseAddr is consistent between them, but not consistent
  between Ruby/M5. So the fact that the PRD BaseAddr is 'wrong' in the
  one case does not prevent it from booting the Kernel.
 
 That's an interesting observation. It would be nice to figure out why that
 address may or may not matter though.
 
 
 
 
  Not really sure if that helps anymore.
 
  Malek
 
  On Tue, Mar 15, 2011 at 6:50 PM, Korey Sewell ksew...@umich.edu
 wrote:
   Sorry for the confusion, I definitely garbled up some terminology.
  
   I meant that the M5 ran with the atomic model to compare with the
   timing Ruby model.
  
   M5-atomic maybe runs in 10-15 mins and then Ruby 20-30 mins.
  
   I am able to get the problem point in the Ruby simulation (bad DMA
  access)
   in about 20 mins.
  
   I able to get to that same problem point in the M5-atomic mode in
   about
  10
   mins so as to see what to compare against and what values are being
   set/unset incorrectly.
  
  
  
   On Tue, Mar 15, 2011 at 6:22 PM, Beckmann, Brad
  brad.beckm...@amd.com
  wrote:
  
   I'm confused.
  
   Korey, I thought this DMA problem only existed with Ruby?  If so,
   how
  were
   you able to reproduce it using atomic mode?  Ruby does not work
   with the atomic cpu model.
  
   Please clarify, thanks!
  
   Brad
  
-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org]
On Behalf Of Korey Sewell
Sent: Tuesday, March 15, 2011 12:09 PM
To: M5 Developer List
Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
   
Hi Brad/Malek,
I've been able to regenerate this error  in about 20mins now
(instead
  of
hours) by running things in atomic mode. Not sure if that helps
or
  not...
   
On Tue, Mar 15, 2011 at 3:03 PM, Beckmann, Brad
brad.beckm...@amd.comwrote:
   
  How is that you are able to run the memtester in FS Mode?
  I see the ruby_mem_tester.py in /configs/example/ but it
  seems
  that
  it is only configured for SE Mode as far as Ruby is concerned?

 I don't run it in FS mode.  Since the DMA bug manifests only
 after hours of execution, I wanted to first verify that the DMA
 protocol support was solid using the mem tester.  Somewhat
 surprisingly, I found several bugs in MOESI_CMP_directory's
 support of DMA.  It
  turns
 out that the initial DMA support in that protocol wasn't very
 well

Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-15 Thread Beckmann, Brad
 How is that you are able to run the memtester in FS Mode?
 I see the ruby_mem_tester.py in /configs/example/ but it seems that it is
 only configured for SE Mode as far as Ruby is concerned?

I don't run it in FS mode.  Since the DMA bug manifests only after hours of 
execution, I wanted to first verify that the DMA protocol support was solid 
using the mem tester.  Somewhat surprisingly, I found several bugs in 
MOESI_CMP_directory's support of DMA.  It turns out that the initial DMA 
support in that protocol wasn't very well thought out.  Now I fixed those bugs, 
but since the DMA problem also arises with the MOESI_hammer protocol, I'm 
confident that my patches don't fix the real problem.

Brad

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-15 Thread Beckmann, Brad
I'm confused.

Korey, I thought this DMA problem only existed with Ruby?  If so, how were you 
able to reproduce it using atomic mode?  Ruby does not work with the atomic cpu 
model.

Please clarify, thanks!

Brad

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Tuesday, March 15, 2011 12:09 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
 Hi Brad/Malek,
 I've been able to regenerate this error  in about 20mins now (instead of
 hours) by running things in atomic mode. Not sure if that helps or not...
 
 On Tue, Mar 15, 2011 at 3:03 PM, Beckmann, Brad
 brad.beckm...@amd.comwrote:
 
   How is that you are able to run the memtester in FS Mode?
   I see the ruby_mem_tester.py in /configs/example/ but it seems that
   it is only configured for SE Mode as far as Ruby is concerned?
 
  I don't run it in FS mode.  Since the DMA bug manifests only after
  hours of execution, I wanted to first verify that the DMA protocol
  support was solid using the mem tester.  Somewhat surprisingly, I
  found several bugs in MOESI_CMP_directory's support of DMA.  It turns
  out that the initial DMA support in that protocol wasn't very well
  thought out.  Now I fixed those bugs, but since the DMA problem also
  arises with the MOESI_hammer protocol, I'm confident that my patches
 don't fix the real problem.
 
  Brad
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
 
 
 --
 - Korey
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-14 Thread Beckmann, Brad
Thanks Malek.  Very interesting.

Yes, this 5 line changeset seems rather benign, but actually has huge 
ramifications.  With this change, the RubyPort passes the correct block size to 
the cpu/device models.  Without it, I believe the block size defaults to 0 or 
1...I can't remember which.  While that seems rather inconsequential, I noticed 
when I made this change that the memtester behaved quite differently.  In 
particular, it keeps issuing requests until sendTiming returns false, instead 
of just one request/cpu at a time.  Therefore another patch in this series 
added the retry mechanism to the RubyPort.  I'm still not sure exactly what the 
problem is with ruby+dma, but I suspect that the dma devices are behaving 
differently now that the RubyPort passes the correct block size.
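
The sendTiming/retry handshake mentioned above can be modeled with a few 
lines of Python.  This is a toy sketch, not the real port code: the class 
and method names are illustrative, and a capacity-one port stands in for the 
DMASequencer's one-packet-at-a-time limit.

```python
class TimingPort:
    """Toy model of a port that nacks when busy and owes a retry."""

    def __init__(self, capacity):
        self.capacity = capacity        # packets accepted at once
        self.in_flight = 0
        self.waiting_retry = False

    def send_timing(self, pkt):
        # Return False (nack) when busy; the sender must then stop
        # issuing and wait for a retry callback before trying again.
        if self.in_flight >= self.capacity:
            self.waiting_retry = True
            return False
        self.in_flight += 1
        return True

    def complete_one(self):
        # A packet finished; report whether a retry is owed so the
        # caller can invoke the peer's retry hook.
        self.in_flight -= 1
        owed = self.waiting_retry
        self.waiting_retry = False
        return owed
```

Under this model the memtester's new behavior is the sender loop: issue 
packets until send_timing returns False, then resend the nacked packet only 
after complete_one signals the retry.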

I was able to spend a few hours on this over the weekend.  I am now able to 
reproduce the error and I have a few protocol bug fixes queued up.  However, I 
don't think those fixes actually solved the main issue.  I don't think I'll be 
able to get to it today, but I'll try to find some time tomorrow to investigate 
further.  

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Korey Sewell
 Sent: Monday, March 14, 2011 2:10 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
 Which lines are you commenting out to  get it to work? It's a bit unclear in 
 the
 diff you point to (maybe because you said it's a full set of changes, not just
 one)
 
 (btw: The work I've been doing is comparing the old m5 memory trace to
 the gem5 memory trace to try to chase down the bug. I wouldn't be
 surprised if we are converging to the same bug though.)
 
 On Mon, Mar 14, 2011 at 3:51 AM, Malek Musleh
 malek.mus...@gmail.com wrote:
  Hi Brad,
 
  I found the problem that was causing this error. Specifically, it is
  this changeset:
 
  changeset:   7909:eee578ed2130
  user:        Joel Hestness hestn...@cs.utexas.edu
  date:        Sun Feb 06 22:14:18 2011 -0800
  summary:     Ruby: Fix to return cache block size to CPU for split
  data transfers
 
  Link: http://reviews.m5sim.org/r/393/diff/#index_header
 
  Previously, I mentioned it was a couple of changesets prior to this
  one, but the changes between them are related, so it wasn't as obvious
  what was happening.
 
  In fact, this corresponds to the assert() for the block size you had
  put in to deal with x86 unaligned accesses, but then later removed
  because of LL/SC in Alpha.
 
  It's not clear to me why this is causing a problem, or rather why this
  doesn't return the default 64 byte block size from the ruby system,
  but commenting out those lines of code allowed it to work.
 
  Maybe Korey could confirm?
 
  Malek
 
  On Wed, Mar 9, 2011 at 8:24 PM, Beckmann, Brad
 brad.beckm...@amd.com wrote:
  I still have not been able to reproduce the problem, but I haven't tried 
  in a
 few weeks.  So does this happen when booting up the system, independent
 of what benchmark you are running?  If so, could you send me your
 command line?  I'm sure the disk image and kernel binaries between us are
 different, so I don't necessarily think I'll be able to reproduce your 
 problem,
 but at least I'll be able to isolate it.
 
  Brad
 
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Malek Musleh
  Sent: Wednesday, March 09, 2011 4:41 PM
  To: M5 Developer List
  Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
  Hi Korey,
 
  I ran into a similar problem with a different benchmark/boot up attempt.
  There is another thread on m5-dev with 'Ruby FS failing with recent
  changesets' as the subject. I was able to track down the changeset
  which it was coming from, but I did not look further into the
  changeset as to why it was causing it.
 
  Brad said he would take a look at it, but I am not sure if he was
  able to reproduce the problem.
 
  Malek
 
  On Wed, Mar 9, 2011 at 7:08 PM, Korey Sewell ksew...@umich.edu
 wrote:
   Hi all,
   I'm trying to run Ruby in FS mode for the FFT benchmark.
  
   However, I've been unable to fully boot the kernel and error with
   a panic in the IDE disk controller:
   panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1
   @ cycle 62640732569001
  
 [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc,
   line 323]
  
   Has anybody run into a similar error or does anyone have any
   suggestions for debugging the problem? I can run the same code
   using the M5 memory system and FFT finishes properly so it's
    definitely a ruby-specific thing. To track this down, I
    could diff instruction traces (M5 v. Ruby) or maybe even diff
    trace output from the IdeDisk trace flags, but those routes seem a
    bit heavy-handed
  considering the amount of trace output generated.
  
   The command line this was run with is:
   build/ALPHA_FS_MOESI_CMP_directory

Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-14 Thread Beckmann, Brad
Hi Malek,

Just to reiterate, I don't think my patches will fix the underlying problem.  
Instead, my patches just fix various corner cases in the protocols.  I suspect 
these corner cases are never actually reached in real execution.

The fact that your dma traces point out that the Ruby and Classic 
configurations use different base addresses makes me think this might be a 
problem with configuration and device registration.  We should investigate 
further.

Brad


 -Original Message-
 From: Malek Musleh [mailto:malek.mus...@gmail.com]
 Sent: Monday, March 14, 2011 9:11 AM
 To: M5 Developer List
 Cc: Beckmann, Brad
 Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
 Hi Korey/Brad,
 
 I commented out the following lines:
 
 In RubyPort.hh
 
  unsigned deviceBlockSize() const;
 
 In RubyPort.cc
 
  unsigned
  RubyPort::M5Port::deviceBlockSize() const
  {
      return (unsigned) RubySystem::getBlockSizeBytes();
  }
 
 I also did a diff trace between M5 and Ruby using the IdeDisk traceflag as
 indicated earlier on.
 
 In the Ruby Trace, it stalls at this
 
 2398589225000: system.disk0: Write to disk at offset: 0x1 data 0
 239858940: system.disk0: Write to disk at offset: 0x2 data 0x10
 2398589575000: system.disk0: Write to disk at offset: 0x3 data 0
 2398589742000: system.disk0: Write to disk at offset: 0x4 data 0
 2398589909000: system.disk0: Write to disk at offset: 0x5 data 0
 2398590088000: system.disk0: Write to disk at offset: 0x6 data 0xe0
 2398596763500: system.disk0: Write to disk at offset: 0x7 data 0xc8
 2398597916500: system.disk0: PRD: baseAddr:0x87298000 (0x7298000)
 byteCount:8192 (16) eot:0x8000 sector:0
 2398597916500: system.disk0: doDmaWrite, diskDelay: 100
 totalDiskDelay: 116
 
 Waiting for the Interrupt to be Posted.
 
 However, a comparison between the M5 and Ruby traces suggest that they
 differ on the following line:
 
 RubyTrace:
 
 239858940: system.disk0: Write to disk at offset: 0x2 data 0x10
 2398589575000: system.disk0: Write to disk at offset: 0x3 data 0
 2398589742000: system.disk0: Write to disk at offset: 0x4 data 0
 2398589909000: system.disk0: Write to disk at offset: 0x5 data 0
 2398590088000: system.disk0: Write to disk at offset: 0x6 data 0xe0
 2398596763500: system.disk0: Write to disk at offset: 0x7 data 0xc8
 2398597916500: system.disk0: PRD: baseAddr:0x87298000 (0x7298000)
 byteCount:8192 (16) eot:0x8000 sector:0
 2398597916500: system.disk0: doDmaWrite, diskDelay: 100
 totalDiskDelay: 116
 
 
 M5 Trace:
 
 2237623634000: system.disk0: Write to disk at offset: 0x7 data 0xc8
 2237624206501: system.disk0: PRD: baseAddr:0x87392000 (0x7392000)
 byteCount:8192
  (16) eot:0x8000 sector:0
 2237624206501: system.disk0: doDmaWrite, diskDelay: 100
 totalDiskDelay: 116
 
  Note that the PRD baseAddr it tries to access is different; I would
  think it should be the same, right? There is no reason why it should be
  different. The 0 or 1 block size and the sequential retries are forcing the
  DMA timer to time out the request, and thus it fails in the inconsistent
  DMA state.
 
 I have attached both sets of traces in case it sheds anymore light on to the
 cause of the problem.
 
 In any case, it might not matter too much now since Brad was able to
 reproduce the problem and has a patch for it, but may be of use for future
 M5 changes.
 
 Malek
 
 On Mon, Mar 14, 2011 at 11:54 AM, Beckmann, Brad
 brad.beckm...@amd.com wrote:
  Thanks Malek.  Very interesting.
 
  Yes, this 5 line changeset seems rather benign, but actually has huge
 ramifications.  With this change, the RubyPort passes the correct block size 
 to
 the cpu/device models.  Without it, I believe the block size defaults to 0 or
 1...I can't remember which.  While that seems rather inconsequential, I
 noticed when I made this change that the memtester behaved quite
 differently.  In particular, it keeps issuing requests until sendTiming 
 returns
 false, instead of just one request/cpu at a time.  Therefore another patch in
 this series added the retry mechanism to the RubyPort.  I'm still not sure
 exactly what the problem is with ruby+dma, but I suspect that the dma
 devices are behaving differently now that the RubyPort passes the correct
 block size.
 
  I was able to spend a few hours on this over the weekend.  I am now able
 to reproduce the error and I have a few protocol bug fixes queued
 up.  However, I don't think those fixes actually solved the main issue.  I 
 don't
 think I'll be able to get to it today, but I'll try to find some time 
 tomorrow to
 investigate further.
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Korey Sewell
  Sent: Monday, March 14, 2011 2:10 AM
  To: M5 Developer List
  Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
  Which lines are you commenting out to  get it to work? It's a bit
  unclear in the diff you point to (maybe because you said it's

Re: [m5-dev] Functional Interface in Ruby

2011-03-13 Thread Beckmann, Brad
You probably already realize this, but I want to point out that the topology 
needs pointers to all the controllers.  I don't have the code in front of me, 
but if I recall correctly, topology is then a member of the network.  If you 
move the controllers underneath RubySystem and if RubySystem keeps its pointer 
to the network, then a cycle exists.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Steve Reinhardt
 Sent: Saturday, March 12, 2011 9:32 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Functional Interface in Ruby
 
 On Sat, Mar 12, 2011 at 5:45 PM, Nilay Vaish ni...@cs.wisc.edu wrote:
 
  On Sat, 12 Mar 2011, Steve Reinhardt wrote:
 
  Can't we loop through the directory controllers in python to
 calculate the
  total size, then pass that size as a parameter to RubySystem?
 There's no
  reason for the C++ RubySystem object to need the directory
 controller
  pointers just to do that calculation.
 
 
  It is being done in Python script. We were thinking of passing
 RubySystem
  object to the Network. But RubySystem cannot be created before
 directory
  controllers are created. And the reason for these changes is to pass
  RubySystem object to the controllers.
 
 
 
 I'm still confused... the python objects can be created in any order,
 and
 parameter values can be set at any time and in any order, up until the
 instantiate() call.  The acyclic dependency issue only affects the
 creation
 of C++ objects in instantiate().  So I don't see how this is relevant.
 
 
  I would like to access cache controllers from RubySystem parameter
 object
  in C++. If we do allow such access, then we would not have any cycle
 in the
  graph. We only need to create controllers, then network and then
 RubySystem
  in Python. If controllers are visible to RubySystem as members of the
  RubySystem parameter object, then we can create the list of cache
 memories
  by probing each controller object.
 
 
 Yea, I can see that even though that's not the m5 idiom, and is a
 little
 less convenient since the python code has to explicitly build this list
 instead of having it happen implicitly, that it fits better with the
 way
 RubySystem is currently built up.
 
 Steve
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Functional Interface in Ruby

2011-03-11 Thread Beckmann, Brad

 In the short run, I think the easiest way to break the cycle is to have the
 network take the RubySystem object as a parameter instead of the other
 way around, then add a registerNetwork() callback on RubySystem to let the
 network give the system its pointer.
 

...

 Finally, it occurs to me that we avoid these issues to some extent in the
 classic m5 memory hierarchy by using ports rather than parameters to set up
 inter-object connections; maybe we should consider extending or adapting
 that model to Ruby someday.
 

I think Steve's short-term solution is a good one.  However, I'm not sure if 
having Ruby always use ports would solve the problem.  The connections that 
Nilay is trying to set up, a system-level list of all caches and memory 
objects, are not real connections; the list is completely artificial.  I'm not 
sure that ports really fit that model.  Instead, it seems like the crux of the 
problem is that we want to set up this list in C++ because it doesn't make 
sense to explicitly set up these connections in the python file.  I'm not sure 
if there is a perfect solution.

Brad


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Get rid of the dead ruby tester.

2011-03-10 Thread Beckmann, Brad
Gabe, thanks for putting this patch out for review.  I had forgotten that this 
directory still exists.  I moved the code that I'm most familiar with out of 
this directory last year, but I didn't touch the Racey tester code because I 
wasn't sure what to do with it.  I believe that code was written by Min Xu 
several years ago to test his flight data recorder.  Subsequently we used to 
use it for general testing because it tended to find certain bugs much faster 
than the standard random tester.  That being said, I suspect that code hasn't 
been used in 5+ years and at some point we need to have a timeout and just 
delete it.  Unless the folks at Wisconsin prefer otherwise, I'm completely fine 
with deleting the whole directory.

Regardless, the DeterministicDriver files should definitely be deleted.  That 
functionality now exists in the directedtest directory.  I should have deleted 
them in my changeset from last year.  

By the way, this reminds me that the directed test code is another piece that 
should be added to the regression tester.  I'll add that to my list.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Steve Reinhardt
 Sent: Thursday, March 10, 2011 11:10 AM
 To: Gabe Black
 Cc: Default; Ali Saidi
 Subject: Re: [m5-dev] Review Request: Ruby: Get rid of the dead ruby
 tester.
 
 I don't think it's dead, just sleeping... I'm not sure why it's not compilable
 right now (I thought it was usable), but I'd rather just fix that up than 
 whack
 the code.  We definitely need some input from Brad or the Wisconsin folks
 before making this change.
 
 Steve
 
 On Thu, Mar 10, 2011 at 11:03 AM, Gabe Black gbl...@eecs.umich.edu
 wrote:
 
 This is an automatically generated e-mail. To reply, visit:
  http://reviews.m5sim.org/r/555/
Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt,
  and Nathan Binkert.
  By Gabe Black.
  Description
 
  Ruby: Get rid of the dead ruby tester.
 
  None of the code in the ruby tester directory is compiled or referred
  to outside of that directory. This change eliminates it. If it's
  needed in the future, it can be revived from the history. In the mean
  time, this removes clutter and the only use of the GEMS_ROOT scons
 variable.
 
Diffs
 
 - src/mem/ruby/tester/DeterministicDriver.hh (77aa0f94e7f2)
 - src/mem/ruby/tester/DeterministicDriver.cc (77aa0f94e7f2)
 - src/mem/ruby/tester/RaceyDriver.hh (77aa0f94e7f2)
 - src/mem/ruby/tester/RaceyDriver.cc (77aa0f94e7f2)
 - src/mem/ruby/tester/RaceyPseudoThread.hh (77aa0f94e7f2)
 - src/mem/ruby/tester/RaceyPseudoThread.cc (77aa0f94e7f2)
 - src/mem/ruby/tester/SConscript (77aa0f94e7f2)
 - src/mem/ruby/tester/SpecifiedGenerator.hh (77aa0f94e7f2)
 - src/mem/ruby/tester/SpecifiedGenerator.cc (77aa0f94e7f2)
 - src/mem/ruby/tester/Tester_Globals.hh (77aa0f94e7f2)
 - src/mem/ruby/tester/main.hh (77aa0f94e7f2)
 - src/mem/ruby/tester/main.cc (77aa0f94e7f2)
 - src/mem/ruby/tester/test_framework.hh (77aa0f94e7f2)
 - src/mem/ruby/tester/test_framework.cc (77aa0f94e7f2)
 
  View Diff http://reviews.m5sim.org/r/555/diff/
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Ruby FS - DMA Controller problem?

2011-03-09 Thread Beckmann, Brad
I still have not been able to reproduce the problem, but I haven't tried in a 
few weeks.  So does this happen when booting up the system, independent of what 
benchmark you are running?  If so, could you send me your command line?  I'm 
sure the disk image and kernel binaries between us are different, so I don't 
necessarily think I'll be able to reproduce your problem, but at least I'll be 
able to isolate it.

Brad



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Malek Musleh
 Sent: Wednesday, March 09, 2011 4:41 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
 
 Hi Korey,
 
 I ran into a similar problem with a different benchmark/boot up attempt.
 There is another thread on m5-dev with 'Ruby FS failing with recent
 changesets' as the subject. I was able to track down the changeset which it
 was coming from, but I did not look further into the changeset as to why it
 was causing it.
 
 Brad said he would take a look at it, but I am not sure if he was able to
 reproduce the problem.
 
 Malek
 
 On Wed, Mar 9, 2011 at 7:08 PM, Korey Sewell ksew...@umich.edu wrote:
  Hi all,
  I'm trying to run Ruby in FS mode for the FFT benchmark.
 
  However, I've been unable to fully boot the kernel and error with a
  panic in the IDE disk controller:
  panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1 @
  cycle 62640732569001
  [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc,
  line 323]
 
  Has anybody run into a similar error or does anyone have any
  suggestions for debugging the problem? I can run the same code using
  the M5 memory system and FFT finishes properly so it's definitely a
  ruby-specific thing. To track this down, I could diff
  instruction traces (M5 v. Ruby) or maybe even diff trace output from
  the IdeDisk trace flags, but those routes seem a bit heavy-handed
 considering the amount of trace output generated.
 
  The command line this was run with is:
  build/ALPHA_FS_MOESI_CMP_directory/m5.opt
 configs/example/ruby_fs.py
  -b fft_64t_base -n 1
 
  The output in system.terminal is:
  hda: M5 IDE Disk, ATA DISK drive
  hdb: M5 IDE Disk, ATA DISK drive
  hda: UDMA/33 mode selected
  hdb: UDMA/33 mode selected
  hdc: M5 IDE Disk, ATA DISK drive
  hdc: UDMA/33 mode selected
  ide0 at 0x8410-0x8417,0x8422 on irq 31
  ide1 at 0x8418-0x841f,0x8426 on irq 31
  ide_generic: please use probe_mask=0x3f module parameter for probing
  all legacy ISA IDE ports
  ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
  ide3 at 0x170-0x177,0x376 on irq 15
  hda: max request size: 128KiB
  hda: 2866752 sectors (1467 MB), CHS=2844/16/63
   hda:4hda: dma_timer_expiry: dma status == 0x65
  hda: DMA interrupt recovery
  hda: lost interrupt
   unknown partition table
  hdb: max request size: 128KiB
  hdb: 1008000 sectors (516 MB), CHS=1000/16/63
   hdb:4hdb: dma_timer_expiry: dma status == 0x65
  hdb: DMA interrupt recovery
  hdb: lost interrupt
 
  Thanks again, any help or thoughts would be well appreciated.
 
  --
  - Korey
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Functional Interface in Ruby

2011-03-09 Thread Beckmann, Brad
I believe the L1DcacheMemory is created right after system because inside each 
protocol file the first things attached to the system are the l1 controllers.  
That way the controllers get a more descriptive name than they would as 
members of the topology.

I'm still a little confused by the cycle error.  If the parent.any call 
searches the graph for the closest object of that particular type, wouldn't you 
always get a cycle using parent.any?  Or are other uses of parent.any more of 
an uncle search than a true parent search?

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Steve Reinhardt
 Sent: Wednesday, March 09, 2011 5:22 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Functional Interface in Ruby
 
 It seems odd that it tries to create L1DcacheMemory right after it creates
 system.  Can you add print statements like in this patch and see what it
 shows?
 
 diff --git a/src/python/m5/SimObject.py b/src/python/m5/SimObject.py
 --- a/src/python/m5/SimObject.py
 +++ b/src/python/m5/SimObject.py
 @@ -843,8 +843,11 @@
 
  # Call C++ to create C++ object corresponding to this object
  def createCCObject(self):
 +print Creating, self, params
  self.getCCParams()
 +print Creating, self
  self.getCCObject() # force creation
 +print Done creating, self
 
  def getValue(self):
  return self.getCCObject()
 
 
 On Wed, Mar 9, 2011 at 2:34 PM, Nilay Vaish ni...@cs.wisc.edu wrote:
 
  Creating root
  Creating system.physmem
  Creating system
  Creating system.l1_cntrl0.L1DcacheMemory Creating system.ruby Creating
  system.ruby.network Creating system.ruby.network.topology Creating
  system.ruby.network.topology.ext_links0
  Creating system.l1_cntrl0
  Creating system.l1_cntrl0.L1DcacheMemory
 
  This is the output I obtained from SimObject.py, clearly there is a cycle.
  Should not the cache controllers be part of ruby, instead of being
  part of system? Once they become part of ruby, it should be possible
  to traverse the controller array and figure out all the caches.
 
 
  Nilay
 
  On Wed, 9 Mar 2011, Steve Reinhardt wrote:
 
   I think you're looking in the wrong place... you want to look at
  getCCObject() in src/python/m5/SimObject.py where the error message
  is coming from, and see if you can add some print statements there.
 
  Steve
 
  On Wed, Mar 9, 2011 at 11:27 AM, Nilay Vaish ni...@cs.wisc.edu wrote:
 
   What exactly happens on the function call
  Param.RubySystem(Parent.any,
  Ruby System) ?
 
  Nilay
 
 
  On Wed, 9 Mar 2011, Steve Reinhardt wrote:
 
   Does the RubySystem object have a pointer to a RubyCache object?
 
 
  You could also go into the python code and add some print
  statements to get a clue about where the cycle is occurring.
 
  Steve
 
  On Wed, Mar 9, 2011 at 4:51 AM, Nilay ni...@cs.wisc.edu wrote:
 
   Brad, given current versions of MESI_CMP_directory.py and Ruby.py,
  the
 
  following change to the way cache memory is added to the system
  creates a loop. What am I missing here?
 
  class RubyAbstractMemory(SimObject):
   type = 'RubyAbstractMemory'
   cxx_class = 'AbstractMemory'
   system = Param.RubySystem(Parent.any,Ruby System);
 
  class RubyCache(RubyAbstractMemory):
   type = 'RubyCache'
   cxx_class = 'CacheMemory'
   size = Param.MemorySize(capacity in bytes);  latency =
  Param.Int();  assoc = Param.Int();  replacement_policy =
  Param.String(PSEUDO_LRU, );  start_index_bit = Param.Int(6,
  index start, default 6 for 64-byte line);
 
  --
  Nilay
 
   ___
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Functional Interface in Ruby

2011-03-08 Thread Beckmann, Brad
Great.  It sounds like we are thinking of a similar solution.  One thing I 
want to point out is that AbstractController may not be the right place to 
build the list.  As you know, some controllers may manage multiple CacheMemory 
objects and other controllers may not manage any CacheMemory or DirectoryMemory 
objects.  Instead, you may want to consider creating a separate RubyStorage 
class, from which both CacheMemory and DirectoryMemory inherit, that builds the 
list.  I'll leave it up to you to decide which is easier.  

Also we don't want to further inhibit ourselves from creating multiple Ruby 
systems in the same simulation.  (I understand there may be other issues that 
currently prevent us from doing that.)  Therefore, instead of using a static 
function, we can build the list on a per RubySystem basis.  The cachememory and 
directorymemory objects should be able to get a pointer to their associated 
RubySystem using the Parent.any directive in their .py file.  See the 
following line in sim/System.py for an example 'physmem = 
Param.PhysicalMemory(Parent.any, physical memory)'.
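
The per-RubySystem registration suggested above, as opposed to a static 
list, can be sketched in a few lines.  This is an illustrative sketch only: 
the register_memory method is hypothetical, and these stripped-down classes 
stand in for gem5's real RubySystem and CacheMemory objects, whose system 
pointer would be resolved via Parent.any as in the System.py example.

```python
class RubySystem:
    """Each system instance owns its own registry of memory objects."""

    def __init__(self):
        self.memories = []

    def register_memory(self, mem):
        # Hypothetical hook: called by each memory object's constructor.
        self.memories.append(mem)


class CacheMemory:
    def __init__(self, system):
        # The system pointer would come from a Parent.any parameter;
        # registering here keeps the list per-system, not global.
        self.system = system
        system.register_memory(self)
```

Because the list lives on the RubySystem instance rather than in a static 
function, two Ruby systems in the same simulation keep separate registries.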

Brad


 -Original Message-
 From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
 Sent: Tuesday, March 08, 2011 3:22 AM
 To: Beckmann, Brad
 Cc: m5-dev@m5sim.org
 Subject: RE: Functional Interface in Ruby
 
 It seems that this will work out. We can make AbstractController call a static
 function of the RubyPort class that will add the calling object to some list,
 which will be accessed while making functional accesses. As far as pushing
 functional access support into the Sequencer is concerned, there was no
 particular reason for that. Since the Sequencer handles the timing accesses, I
 thought that should be the file that would contain the code for functional
 accesses. I am fine with the functional access code going into RubyPort.
 
 --
 Nilay
 
 
 On Mon, 7 Mar 2011, Beckmann, Brad wrote:
 
  Hi Nilay,
 
  Please excuse the slow response.  I've been meaning to reply to this email
 for a few days.
 
  Absolutely, we will need to maintain some sort of list of all
  cachememory and directorymemory objects to make the functional access
  support work.  However, I'm not sure if we'll need to modify the
  protocol python files.  Instead, could we create a list of these
  objects through their c++ constructors similar to how the SimObject
  list is created?  Also, I know the line between the RubyPort and
  Sequencer is quite blurry, but is there a particular reason to push
  the functional access support into the Sequencer?  It seems that the
  RubyPort would be a more natural location.
 
  Brad
 
 
  -Original Message-
  From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
  Sent: Friday, March 04, 2011 9:49 AM
  To: Beckmann, Brad
  Cc: m5-dev@m5sim.org
  Subject: Functional Interface in Ruby
 
  I have been thinking about how to make Ruby support functional
 accesses.
 It seems that somewhere we will have to add support so that either
  RubyPort or Sequencer can view all other caches. I am currently
  leaning towards adding it to the sequencer. I think this can be done
  by editing protocol files in configs/ruby. And then RubyPort can pass
  on functional accesses to the Sequencer, which will look up all the caches
 and take the correct action.
 
  I think this can be made to work.
 
  Nilay
 
 
 


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Functional Interface in Ruby

2011-03-08 Thread Beckmann, Brad
Hi Nilay,

It looks like my email filter for the m5-dev list caused me to send you basically 
the same suggestion that Steve sent you.  Sorry for the confusion, but it is 
good to know that Steve and I at least are considering the same problem.  From 
now on, let's drop our individual email addresses and just direct our responses 
to m5-dev.

Brad


From: Steve Reinhardt [mailto:ste...@gmail.com]
Sent: Tuesday, March 08, 2011 7:18 AM
To: M5 Developer List
Cc: Nilay Vaish; Beckmann, Brad
Subject: Re: [m5-dev] Functional Interface in Ruby

Forgot to mention that this is how we handle registering all the thread 
contexts within a system... you can look at that code (in the CPU models and in 
System) for an example.
On Tue, Mar 8, 2011 at 7:16 AM, Steve Reinhardt <ste...@gmail.com> wrote:
Sorry I missed this thread... I just read Nilay's response about python issues 
and he pointed me over here.

One thing we should think about is that we really only want the caches within a 
single system to be flushed at once... I know that it's unlikely that anyone 
will want to model two systems with detailed memory models at once, and I 
vaguely recall there were other issues with Ruby not really supporting multiple 
instances of itself, but I don't want to see us make things less modular than 
they already are.

The m5 idiom for doing this is:
- add a parameter to each cache/controller/whatever we want to track like this:
 system = Param.System(Parent.any, "system object")
- add a method to the System object like registerCache(Cache *c) that adds c to 
the system object's list of caches
- Have each cache constructor call p->system->registerCache(this) to register 
itself
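The steps above can be sketched in plain Python.  This is a toy stand-in, not gem5's real SimObject machinery: the `System`/`Cache` classes and `registerCache` name are illustrative, and step 1's `Param.System(Parent.any, ...)` lookup is modeled by simply passing the system to the constructor.

```python
# Toy registration idiom: each cache registers itself with *its own* system,
# so a per-system walk never touches another simulated system's caches.
# Names are illustrative, not gem5's actual API.

class System:
    def __init__(self):
        self.caches = []                  # caches registered with this system

    def registerCache(self, cache):       # step 2: the registration method
        self.caches.append(cache)

class Cache:
    def __init__(self, system, name):
        self.name = name
        # step 3: the constructor registers the new object with its system
        # (standing in for p->system->registerCache(this) in C++)
        system.registerCache(self)

sys_a = System()
sys_b = System()
Cache(sys_a, "l1-icache")
Cache(sys_a, "l1-dcache")
Cache(sys_b, "l1-icache")
```

Routing registration through the system object is what keeps a flush or functional-access walk scoped to a single system.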

Would something like this work for what you're trying to do?

Steve

On Tue, Mar 8, 2011 at 3:21 AM, Nilay Vaish <ni...@cs.wisc.edu> wrote:
It seems that this will work out. We can make AbstractController call a static 
function of RubyPort class that will add the calling object to some list which 
will be accessed while making functional accesses. As far as pushing functional 
access support into the Sequencer is concerned, there was no particular reason for 
that. Since the Sequencer handles the timing accesses, I thought that should be the 
file that would contain the code for functional accesses. I am fine with the 
functional access code going into RubyPort.

--
Nilay



On Mon, 7 Mar 2011, Beckmann, Brad wrote:
Hi Nilay,

Please excuse the slow response.  I've been meaning to reply to this email for 
a few days.

Absolutely, we will need to maintain some sort of list of all cachememory and 
directorymemory objects to make the functional access support work.  However, 
I'm not sure if we'll need to modify the protocol python files.  Instead, could 
we create a list of these objects through their c++ constructors similar to how 
the SimObject list is created?  Also, I know the line between the RubyPort and 
Sequencer is quite blurry, but is there a particular reason to push the 
functional access support into the Sequencer?  It seems that the RubyPort would 
be a more natural location.

Brad

-Original Message-
From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
Sent: Friday, March 04, 2011 9:49 AM
To: Beckmann, Brad
Cc: m5-dev@m5sim.org
Subject: Functional Interface in Ruby

I have been thinking about how to make Ruby support functional accesses.
It seems that somewhere we will have to add support so that either RubyPort or 
Sequencer can view all other caches. I am currently leaning towards adding it
to the sequencer. I think this can be done by editing protocol files in
configs/ruby. And then RubyPort can pass on functional accesses to the
Sequencer, which will look up all the caches and take the correct action.

I think this can be made to work.

Nilay


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Functional Interface in Ruby

2011-03-07 Thread Beckmann, Brad
Hi Nilay,

Please excuse the slow response.  I've been meaning to reply to this email for 
a few days.

Absolutely, we will need to maintain some sort of list of all cachememory and 
directorymemory objects to make the functional access support work.  However, 
I'm not sure if we'll need to modify the protocol python files.  Instead, could 
we create a list of these objects through their c++ constructors similar to how 
the SimObject list is created?  Also, I know the line between the RubyPort and 
Sequencer is quite blurry, but is there a particular reason to push the 
functional access support into the Sequencer?  It seems that the RubyPort would 
be a more natural location.

Brad


 -Original Message-
 From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
 Sent: Friday, March 04, 2011 9:49 AM
 To: Beckmann, Brad
 Cc: m5-dev@m5sim.org
 Subject: Functional Interface in Ruby
 
 I have been thinking about how to make Ruby support functional accesses.
 It seems that somewhere we will have to add support so that either RubyPort or
 Sequencer can view all other caches. I am currently leaning towards adding it
 to the sequencer. I think this can be done by editing protocol files in
 configs/ruby. And then RubyPort can pass on functional accesses to the
 Sequencer, which will look up all the caches and take the correct action.
 
 I think this can be made to work.
 
 Nilay


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Testing Functional Access

2011-03-01 Thread Beckmann, Brad
Hi Nilay,

I would suggest a few different tests.  The first one would be to run a simple 
binary under Alpha SE mode using Ruby.  You should first observe a bunch of 
functional accesses that initialize memory and then (if I recall correctly) 
dynamic accesses will load the TLB.  After passing that test, I would try 
loading a SE checkpoint and running.  After that, I would move on to similar 
tests using FS mode. 

I hope that helps.  Please let me know if you have any specific questions.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay
 Sent: Tuesday, March 01, 2011 6:51 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Testing Functional Access
 
 How can I test whether or not functional accesses to the memory are
 working correctly? Do we have some regression test for this?
 
 Thanks
 Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Testing Functional Access

2011-03-01 Thread Beckmann, Brad
I forgot that the memtester includes functional accesses.  That is a good 
suggestion, especially when it comes to testing the situations where Ruby can't 
satisfy the functional access due to contention with timing accesses.

The memtester does run with Ruby (it actually runs every night in the 
regression tester), however the percentage of functional accesses is currently 
set to zero.  See configs/example/ruby_mem_test.py.  You'll obviously want to 
change that and include code within src/cpu/testers/memtest/* to handle failed 
functional accesses.  If you don't want to initially deal with the failure 
situations, you can set the functional access percentage to 100% and that 
should always work.
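A hypothetical edit along those lines (fragment only, not a complete config): `percent_functional` is the real MemTest parameter Steve mentions, but the surrounding variable names are illustrative.

```python
# Illustrative fragment of the kind of change to configs/example/ruby_mem_test.py.
# Setting percent_functional to 100 makes every generated access functional,
# the "should always work" case; a value between 0 and 100 mixes functional
# and timing accesses and exercises the failure-handling paths.
cpus = [ MemTest(percent_functional = 100)     # was 0
         for i in xrange(options.num_cpus) ]
```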

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Steve Reinhardt
 Sent: Tuesday, March 01, 2011 10:49 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Testing Functional Access
 
 The m5 memtester supports functional accesses (there's a
 percent_functional parameter on the MemTest object).  I don't know if
 anyone's run the memtester with Ruby though.  Seems like it should work.
 
 Steve
 
 On Tue, Mar 1, 2011 at 8:39 AM, Joel Hestness
 hestn...@cs.utexas.eduwrote:
 
  Hi Nilay,
   I don't know if there is a regression for it, but the M5 utility
  (./util/m5/) sets up functional accesses to memory.  For instance, in
   FS, if you specify an rcS script to fs.py and call "% /sbin/m5
   readfile" from the command line of the simulated system, it will read
  the specified rcS file off the host machine's disk and send it to the
  memory of the simulated system using functional accesses.  I think
  there are other functional access examples in the magic that the M5
  utility provides.
   Hope this helps,
   Joel
 
 
 
  On Tue, Mar 1, 2011 at 8:51 AM, Nilay ni...@cs.wisc.edu wrote:
 
   How can I test whether or not functional accesses to the memory are
   working correctly? Do we have some regression test for this?
  
   Thanks
   Nilay
  
   ___
   m5-dev mailing list
   m5-dev@m5sim.org
   http://m5sim.org/mailman/listinfo/m5-dev
  
 
 
 
  --
   Joel Hestness
   PhD Student, Computer Architecture
   Dept. of Computer Science, University of Texas - Austin
  http://www.cs.utexas.edu/~hestness
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Ruby: Fix DPRINTF bugs in PerfectSwitch and MessageBuffer

2011-03-01 Thread Beckmann, Brad
Hi Nilay,

In the future, feel free to directly check in these sort of minor bug fixes.

Thanks,

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Tuesday, March 01, 2011 1:32 PM
 To: Nilay Vaish; Default
 Subject: [m5-dev] Review Request: Ruby: Fix DPRINTF bugs in PerfectSwitch
 and MessageBuffer
 
 
 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/505/
 ---
 
 Review request for Default.
 
 
 Summary
 ---
 
 At a couple of places in PerfectSwitch.cc and MessageBuffer.cc, DPRINTF()
 has not been provided with the correct number of arguments. The patch fixes
 these bugs.
 
 
 Diffs
 -
 
   src/mem/ruby/buffers/MessageBuffer.cc UNKNOWN
   src/mem/ruby/network/simple/PerfectSwitch.cc UNKNOWN
 
 Diff: http://reviews.m5sim.org/r/505/diff
 
 
 Testing
 ---
 
 
 Thanks,
 
 Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Functional Access support in Ruby

2011-02-26 Thread Beckmann, Brad
Hi Nilay,

What exactly are you referring to as the underlying processor?  Are you 
referring to real silicon?

Actual hardware doesn't support functional accesses.  Functional accesses are 
unique to gem5 and are completely fake when compared to actual hardware.  gem5 
could support functional accesses by quiescing the system and then perform the 
read or write using the existing timing path.  That would probably be a 
suitable solution if gdb running on the simulated system was the only source of 
dynamic functional accesses.  However, there are other sources of dynamic 
functional accesses and we don't want to always perturb the system when 
performing those accesses.  Thus we need a backdoor that doesn't perturb the 
system.
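A toy model of that backdoor, in plain Python (all names here are illustrative, not the real RubyPort/CacheMemory API): the functional write updates every valid copy directly and never schedules an event, so simulated time is not perturbed, and a miss everywhere is reported to the caller, which can then raise an error.

```python
# Toy functional-access backdoor; illustrative names only.

class ToyCache:
    def __init__(self):
        self.blocks = {}                          # addr -> [state, data]

    def is_valid(self, addr):
        return addr in self.blocks and self.blocks[addr][0] != "I"

def functional_write(caches, addr, data):
    """Update every valid copy in place -- no events, no simulated time.

    Returns True iff at least one valid copy was found; False tells the
    caller to flag an error, since no controller holds the block in a
    valid stable state."""
    hit = False
    for cache in caches:
        if cache.is_valid(addr):
            cache.blocks[addr][1] = data          # write the data directly
            hit = True
    return hit

l1, l2 = ToyCache(), ToyCache()
l1.blocks[0x40] = ["M", 1]                        # one modified copy in the L1
ok = functional_write([l1, l2], 0x40, 99)
```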

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
 Sent: Saturday, February 26, 2011 9:06 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Functional Access support in Ruby
 
 I was thinking about the behavior of functional accesses. Currently in
 gdb
 we can change the value of a program variable. Does that mean the
 underlying processor supports functional accesses? If yes, then we
 should
 already have some knowledge about what is expected from functional
 accesses.
 
 Nilay
 
 
 On Fri, 25 Feb 2011, Beckmann, Brad wrote:
 
  Yes, that is correct.  The RubyPort::M5Port::recvFunctional()
 function is where we need to add the new support.
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
  On Behalf Of Nilay Vaish
  Sent: Friday, February 25, 2011 12:20 PM
  To: m5-dev@m5sim.org
  Subject: [m5-dev] Functional Access support in Ruby
 
  Brad,
 
  Here is my understanding of the current state of functional accesses
 in gem5.
  As of now, all functional accesses are forwarded to the
 PhysicalMemory's
  MemoryPort. Instead, we would like to add
  recvFunctional() function to M5Port of the RubyPort, and attach this
 port as
  peer instead of the PhysicalMemory.
 
  --
  Nilay
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Functional Access support in Ruby

2011-02-25 Thread Beckmann, Brad
Yes, that is correct.  The RubyPort::M5Port::recvFunctional() function is where 
we need to add the new support.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Friday, February 25, 2011 12:20 PM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Functional Access support in Ruby
 
 Brad,
 
 Here is my understanding of the current state of functional accesses in gem5.
 As of now, all functional accesses are forwarded to the PhysicalMemory's
 MemoryPort. Instead, we would like to add
 recvFunctional() function to M5Port of the RubyPort, and attach this port as
 peer instead of the PhysicalMemory.
 
 --
 Nilay
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-25 Thread Beckmann, Brad
It sounds like we are in agreement here, but I just want to make sure we 
clarify one item.  I don't believe simply checking the coherence permissions at 
commit time can sufficiently support stronger consistency models like SC/TSO.  
Instead, you really need to know whether you've ever lost the block 
since the speculative instruction read it.  Therefore, Ruby really does need to 
forward invalidations to the CPU.

It sounded like from your responses that you understand that as well, but I 
just wanted to make the point clear.
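A toy illustration of the distinction, in plain Python (class and method names are illustrative): the LSQ must remember whether an invalidation ever arrived between a speculative load and its commit; merely re-checking coherence permission at commit would miss an invalidate-then-refetch window.

```python
# Toy speculative LSQ; illustrative names only.  The key state is "did I
# ever lose the block since the speculative load read it?", which only
# works if the memory system forwards invalidations up to the CPU.

class SpeculativeLSQ:
    def __init__(self):
        self.intact = {}                  # addr -> block never invalidated?

    def speculative_load(self, addr):
        self.intact[addr] = True

    def on_invalidate(self, addr):        # forwarded by the L1 controller
        if addr in self.intact:
            self.intact[addr] = False

    def commit_ok(self, addr):
        # NOT a permission check: the L1 may hold the block with full
        # permission again by now, yet the speculatively loaded value
        # could still be stale.
        return self.intact.pop(addr)

lsq = SpeculativeLSQ()
lsq.speculative_load(0x100)
lsq.on_invalidate(0x100)                  # another core wrote in between
must_squash = not lsq.commit_ok(0x100)    # load must be squashed for SC/TSO
```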

Brad


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Steve Reinhardt
Sent: Friday, February 25, 2011 10:29 AM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer

This sounds right.  Ruby does need to forward invalidations to the CPU since 
some models (including O3) will need to do internal invalidations/flushes to 
maintain consistency.  Others can choose to do it other ways (e.g., by querying 
the L1 at commit as you suggest), but they have the option of ignoring the 
forwarded invalidations, so that's not a problem.

Steve
On Fri, Feb 25, 2011 at 9:07 AM, Arkaprava Basu <aba...@wisc.edu> wrote:
In sum, I think we all agree that Ruby is going to handle *only non-speculative 
stores*.  M5 CPU model(s) handles all of speculative and non-speculative stores 
that are *yet to be revealed to the memory sub-system*.

To make it clearer, as I understand,  we now have following:

1. All store buffering (speculative and non-speculative) is handled by CPU 
model in M5.
2. Ruby needs to forward interventions/invalidations received at the L1 cache 
controller to the CPU model so that it can take the appropriate action to 
provide the required memory consistency guarantees (e.g., it may need to flush 
the pipeline).
OR
CPU models need to check coherence permission at the L1 cache at commit 
time to know whether intervening writes have happened or not (might be 
required to implement stricter models like SC).

I think we need to provide one of the functionality from Ruby side to allow the 
second condition above. Which one to provide depends upon what M5 CPU models 
wants to do to guarantee consistency.

Please let me know if you disagree or if I am missing something.

Thanks
Arka





On 02/24/2011 05:22 PM, Beckmann, Brad wrote:

So I think Steve and I are in agreement here.  We both agree that both 
speculative and non-speculative store buffers should be on the CPU side of the 
RubyPort interface.  I believe that was the same line that existed when Ruby 
tied to Opal in GEMS.  I believe the non-speculative store buffer was only a 
feature used when Opal was not attached, and it was just the simple 
SimicsProcessor driving Ruby.



The sequencer is a separate issue.  Certain functionality of the sequencer can 
probably be eliminated in gem5, but I think other functionality needs to remain 
or at least be moved to some other part of Ruby.  The sequencer performs a lot 
of protocol independent functionality including: updating the actual data 
block, performing synchronization with respect to the cache memory, translating 
m5 packets to ruby requests, checking for per-cacheblock deadlock, and 
coalescing requests to the same cache block.  The coalescing functionality can 
probably be eliminated, but I think the other functionality needs to remain.



Brad





From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt

Sent: Thursday, February 24, 2011 1:52 PM

To: M5 Developer List

Subject: Re: [m5-dev] Store Buffer





On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish <ni...@cs.wisc.edu> wrote:

On Thu, 24 Feb 2011, Beckmann, Brad wrote:

Steve, I think we are in agreement here and we may just be disagreeing with the 
definition of speculative.  From the Ruby perspective, I don't think it really 
matters...I don't think there is difference between a speculative store address 
request and a prefetch-with-write-intent. Also we agree that probes will need 
to be sent to O3 LSQ to support the consistency model.

My point is that if we believe this functionality is required, what is the 
extra overhead of adding a non-speculative store buffer to the O3 model as 
well?  I think that will be easier than trying to incorporate the current Ruby 
non-speculative store buffer into each protocol.



I don't know the O3 LSQ model very well, but I assume it buffers both 
speculative and non-speculative stores.  Are there two different structures in 
Ruby for that?



I think the general issue here is that the dividing line between processor 
and memory system is different in M5 than it was with GEMS, with M5 assuming 
that write buffers, redundant request filtering, etc. all happen in the 
processor.  For example, I know I've had you explain this to me multiple 
times already, but I still don't understand

Re: [m5-dev] Store Buffer

2011-02-24 Thread Beckmann, Brad
So we probably don't want to pass speculative store data to the RubyPort, but 
what about speculative load and store requests?  I suspect we do want to send 
them to the RubyPort before the speculation is confirmed.  That might require 
splitting stores into two separate transactions: the request and the actual data 
write.  Also I suspect that the RubyPort will need to forward probes to the cpu 
models to allow the LSQ to maintain the proper consistency model.  If those two 
things end up being true, then what is the benefit of putting the 
non-speculative store buffer in each protocol, versus just in the o3 cpu model?

I'm not yet ready to advocate that is the right solution.  I just want us to 
think these issues thru before deciding to go down one path or the other.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
 Sent: Thursday, February 24, 2011 10:45 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Store Buffer
 
 On Thu, 24 Feb 2011, Arkaprava Basu wrote:
 
  Fundamentally, I wish to handle only non-speculative memory state within
  Ruby. Otherwise I think there might be a risk of Ruby getting affected by
  the CPU model's behavior/nuances. As you suggested, RubyPort may well be
  the line dividing speculative and non-speculative state.
 
 I also agree that beyond RubyPort, all the stores should be
 non-speculative.
 
  I haven't looked at the store buffer code in libruby and do not know how
  it interfaces with the protocols. So sorry, I don't have specific answers
  to your questions. I think Derek is the best person to comment on this, as
  I believe he has used the store buffer implementation for his prior
  research.
 
 I think currently the store buffer is not being used at all. I looked
 through the GEMS code, and some of the protocols do declare a store buffer,
 but no one makes use of it. In gem5, store buffers are not included
 in the protocol files. In fact, the current libruby code does nothing
 useful at all.
 
  I do think, though, that the highest-level (closest to the processor) cache
  controller (i.e. *-L1Cache.sm) needs to be made aware of the store buffer
  (unless it is hacked to bypass SLICC).
 
  Thanks
  Arka
 
 
 --
 Nilay
 
  On 02/23/2011 11:29 PM, Beckmann, Brad wrote:
  Sorry, I should have been more clear.  It fundamentally comes down to how
  the Ruby interface helps support memory consistency, especially considering
  more realistic buffering between the CPU and memory system (both speculative
  and non-speculative).  I'm pretty certain that Ruby and the RubyPort
  interface will need to be changed.  I just want us to fully understand the
  issues before making any changes or removing certain options.  So are you
  advocating that the RubyPort interface be the line between speculative
  memory state and non-speculative memory state?

  As far as the current Ruby store buffer goes, how does it work with the L1
  cache controller?  For instance, if the L1 cache receives a probe/forwarded
  request to a block that exists in the non-speculative store buffer, what is
  the mechanism to retrieve the up-to-date data from the buffer entry?  Is
  the mechanism protocol agnostic?
 
  Brad
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-24 Thread Beckmann, Brad
Steve, I think we are in agreement here and we may just be disagreeing with the 
definition of speculative.  From the Ruby perspective, I don't think it really 
matters...I don't think there is difference between a speculative store address 
request and a prefetch-with-write-intent.  Also we agree that probes will need 
to be sent to O3 LSQ to support the consistency model.

My point is that if we believe this functionality is required, what is the 
extra overhead of adding a non-speculative store buffer to the O3 model as 
well?  I think that will be easier than trying to incorporate the current Ruby 
non-speculative store buffer into each protocol.

Overall, I guess I'm concluding that we probably can delete the current Ruby 
store buffer.  Do others agree?

Brad


From: Steve Reinhardt [mailto:ste...@gmail.com]
Sent: Thursday, February 24, 2011 11:20 AM
To: M5 Developer List
Cc: Beckmann, Brad
Subject: Re: [m5-dev] Store Buffer


On Thu, Feb 24, 2011 at 11:08 AM, Beckmann, Brad <brad.beckm...@amd.com> wrote:
So we probably don't want to pass speculative store data to the RubyPort, but 
what about speculative load and store requests?  I suspect we do want to send 
them to the RubyPort before the speculation is confirmed.  That might require 
splitting stores into two separate transactions: the request and the actual data 
write.  Also I suspect that the RubyPort will need to forward probes to the cpu 
models to allow the LSQ to maintain the proper consistency model.  If those two 
things end up being true, then what is the benefit of putting the 
non-speculative store buffer in each protocol, versus just in the o3 cpu model?

I'm not yet ready to advocate that is the right solution.  I just want us to 
think these issues thru before deciding to go down one path or the other.

I also support the concept of thinking things through, but I'm also happy to 
comment without having done that yet :-).

My gut instinct is to say that O3 already has an LSQ, so Ruby needs to send 
invalidations up to the core to support the consistency model, and if we do 
that there's no need for a store buffer in Ruby. I'd like to better understand 
the arguments against that approach.  For example, why would we want to send 
stores to Ruby when they are still speculative?  Do we have real examples of 
systems that send the store address to the L1 cache speculatively?  If we want 
to fetch store data more aggressively, wouldn't it be equivalent to generate a 
prefetch-with-write-intent first, then generate the store itself only when it 
commits?  I think there are machines that do that.

Steve
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-24 Thread Beckmann, Brad
So I think Steve and I are in agreement here.  We both agree that both 
speculative and non-speculative store buffers should be on the CPU side of the 
RubyPort interface.  I believe that was the same line that existed when Ruby 
tied to Opal in GEMS.  I believe the non-speculative store buffer was only a 
feature used when Opal was not attached, and it was just the simple 
SimicsProcessor driving Ruby.

The sequencer is a separate issue.  Certain functionality of the sequencer can 
probably be eliminated in gem5, but I think other functionality needs to remain 
or at least be moved to some other part of Ruby.  The sequencer performs a lot 
of protocol independent functionality including: updating the actual data 
block, performing synchronization with respect to the cache memory, translating 
m5 packets to ruby requests, checking for per-cacheblock deadlock, and 
coalescing requests to the same cache block.  The coalescing functionality can 
probably be eliminated, but I think the other functionality needs to remain.
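As a concrete example of one of those protocol-independent duties, here is a toy version of the per-cache-block deadlock check, in plain Python (the class name, method names, and threshold value are all illustrative, not the real Sequencer code):

```python
# Toy per-cache-block deadlock check: outstanding requests are timestamped
# at issue, and any request that sits unanswered past a threshold is flagged.
# Illustrative names and threshold; the real sequencer would panic with the
# offending address instead of returning a list.

DEADLOCK_THRESHOLD = 500000               # cycles a request may stay outstanding

class ToySequencer:
    def __init__(self):
        self.outstanding = {}             # block address -> issue cycle

    def issue(self, addr, now):
        self.outstanding[addr] = now

    def complete(self, addr):
        del self.outstanding[addr]

    def check_deadlock(self, now):
        return [addr for addr, issued in self.outstanding.items()
                if now - issued >= DEADLOCK_THRESHOLD]

seq = ToySequencer()
seq.issue(0x40, now=100)
seq.issue(0x80, now=200)
seq.complete(0x80)                        # this request finished in time
stuck = seq.check_deadlock(now=600000)    # 0x40 has waited too long
```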

Brad


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Steve Reinhardt
Sent: Thursday, February 24, 2011 1:52 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer


On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish <ni...@cs.wisc.edu> wrote:
On Thu, 24 Feb 2011, Beckmann, Brad wrote:
Steve, I think we are in agreement here and we may just be disagreeing with the 
definition of speculative.  From the Ruby perspective, I don't think it really 
matters...I don't think there is difference between a speculative store address 
request and a prefetch-with-write-intent. Also we agree that probes will need 
to be sent to O3 LSQ to support the consistency model.
My point is that if we believe this functionality is required, what is the 
extra overhead of adding a non-speculative store buffer to the O3 model as 
well?  I think that will be easier than trying to incorporate the current Ruby 
non-speculative store buffer into each protocol.

I don't know the O3 LSQ model very well, but I assume it buffers both 
speculative and non-speculative stores.  Are there two different structures in 
Ruby for that?

I think the general issue here is that the dividing line between processor 
and memory system is different in M5 than it was with GEMS, with M5 assuming 
that write buffers, redundant request filtering, etc. all happen in the 
processor.  For example, I know I've had you explain this to me multiple 
times already, but I still don't understand why we still need Ruby sequencers 
either :-).

Brad, I raise the same point that Arka raised earlier. Other processor models 
can also make use of store buffer. So, why only O3 should have a store buffer?

Nilay, I think that's a different issue... we're not saying that other CPU 
models can't have store buffers, but in practice, the simple CPU models block 
on memory accesses so they don't need one.  If the inorder model wants to add a 
store buffer (if it doesn't already have one), it would be an internal decision 
for them whether they want to write one from scratch or try to reuse the O3 
code.  There are already some shared structures in src/cpu like branch 
predictors that can be reused across CPU models.

So in other words we need to decide first where the store buffer should live 
(CPU or memory system) and then we can worry about how to reuse that code if 
that's useful.
Steve
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] MOESI_CMP_directory-perfectDir.sm

2011-02-23 Thread Beckmann, Brad
Since I haven't heard any objections, I'm going to go ahead and remove it.

Brad


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Beckmann, Brad
Sent: Tuesday, February 22, 2011 2:37 PM
To: Default (m5-dev@m5sim.org)
Subject: [m5-dev] MOESI_CMP_directory-perfectDir.sm

Hi All,

I just posted a patch that removes all of the protocol files that are not 
supported in gem5.  However, I'm not sure if anyone has used/is using the file 
MOESI_CMP_directory-perfectDir.sm.  I've never used it before and I have no 
idea if it even works or what exactly it is supposed to do.

Do people mind if I just remove it?  I'll post the same question to the user 
list.

Brad
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-23 Thread Beckmann, Brad
That's a good question.  Before we get rid of it, we should decide what is the 
interface between Ruby and the o3 LSQ.  I don't know how the current o3 LSQ 
works, but I imagine that we need to pass probe requests through the RubyPort to 
make it work correctly.

Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Wednesday, February 23, 2011 4:51 PM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Store Buffer
 
 Brad,
 
 In case we remove libruby, what becomes of the store buffer? In fact, is
 store buffer in use?
 
 Thanks
 Nilay
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev




[m5-dev] MOESI_CMP_directory-perfectDir.sm

2011-02-22 Thread Beckmann, Brad
Hi All,

I just posted a patch that removes all of the protocol files that are not 
supported in gem5.  However, I'm not sure if anyone has used/is using the file 
MOESI_CMP_directory-perfectDir.sm.  I've never used it before and I have no 
idea if it even works or what exactly it is supposed to do.

Do people mind if I just remove it?  I'll post the same question to the user 
list.

Brad


Re: [m5-dev] CacheController's wakeup function

2011-02-16 Thread Beckmann, Brad
Hi Nilay,

I'm not quite sure what you mean by "appended to while you drain", but I think 
you are asking whether the input ports will receive messages that are scheduled 
for the same cycle as the current cycle.  Is that right?  If so, then you are 
correct, that should not happen.

As long as the input ports are evaluated in the current order of priority, 
your change looks good to me. In the past, one could limit the loop 
iterations per cycle to approximate cache port contention.  Therefore the 
higher priority ports must be listed first to avoid mandatory requests starving 
external responses.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
 Sent: Tuesday, February 15, 2011 9:09 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] CacheController's wakeup function
 
 On Tue, 15 Feb 2011, nathan binkert wrote:
 
  While I don't know anything about this code it looks a little suspect
  to me. Is there really a while (true) or is there some sort of while
  (!empty)?  Can the queues be appended to while you drain?  If these
  are both true, then you'll lose some of your enqueued messages.
 
  Sorry if I'm uninformed.
 
 
 It is a while(true), and there is break statement which is executed in
 case none of the queues have any messages. I am almost certain that the
 incoming queues do not get appended to while they are being drained, I
 would like Brad to confirm this.
 
 --
 Nilay
 
  I thought of this a moment ago, so I have not confirmed this
 empirically.
  The CacheController's wakeup function includes a while loop, in
 which all
  the queues are checked. Consider the Hammer protocol's L1 Cache
 Controller.
  It has four incoming queues - trigger, response, forward, mandatory.
 The
  wakeup function looks like this --
 
  while(true)
  {
   process trigger queue;
   process response queue;
   process forward queue;
   process mandatory queue;
  }
 
  where process means processing a single message from the queue.
  I expect most of the messages to be present in the mandatory queue
 which
  processes the actually loads and stores issued by the associated
 processor.
  Would the following be better --
 
  while(true) process trigger queue;
  while(true) process response queue;
  while(true) process forward queue;
  while(true) process mandatory queue;
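For what it's worth, the difference between the two loop structures can be modeled outside the simulator. The following is an illustrative Python sketch, not gem5 code: the queue names mirror the Hammer L1 controller above, and the per-pass `budget` is a hypothetical stand-in for a per-cycle cap on processed messages.

```python
from collections import deque

def drain_interleaved(queues, budget):
    """At most one message per queue per pass, in priority order."""
    processed = []
    while budget > 0:
        progressed = False
        for name, q in queues:
            if q and budget > 0:
                processed.append((name, q.popleft()))
                budget -= 1
                progressed = True
        if not progressed:
            break  # every queue is empty
    return processed

def drain_queue_at_a_time(queues, budget):
    """Fully drain each queue before touching the next one."""
    processed = []
    for name, q in queues:
        while q and budget > 0:
            processed.append((name, q.popleft()))
            budget -= 1
    return processed

# With a cap of 3 messages per cycle, interleaving lets every queue make
# progress, while fully draining a long mandatory queue first starves the
# response queue -- which is why the ordering of the ports matters.
make = lambda: [("mandatory", deque(["m0", "m1", "m2"])),
                ("response", deque(["r0"]))]
print(drain_interleaved(make(), 3))      # mandatory and response both progress
print(drain_queue_at_a_time(make(), 3))  # only mandatory messages processed
```

This is also why, as Brad notes, the higher-priority ports must be listed first when a per-cycle cap is in effect.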
 
  I do not expect any improvement in case of FS profiling as most of
 the
  times, the mandatory queue has only one single message. But for
 testing
  protocols using ruby random tester, I do expect some improvement. In
 FS
  profile, after the histogram function (which takes about 8% time),
 the
  wakeup function's execution time is the highest (about 5%). For ruby
 random
  tester profile, the wakeup function takes about 11% of the time.
 
  --
  Nilay
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 



Re: [m5-dev] MOESI Hammer Protocol Deadlock

2011-02-10 Thread Beckmann, Brad
Hi Nilay,

Thanks for the heads up.  I looked into it and there is a simple fix.

I'm pushing the fix momentarily.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Thursday, February 10, 2011 5:40 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] MOESI Hammer Protocol Deadlock
 
 Hi Brad,
 
 I think MOESI hammer protocol has a deadlock scenario. Try the following -
 
    hg update -r 7922
    scons USE_MYSQL=False RUBY=True CC=gcc44 CXX=g++44 NO_HTML=True --no-colors build/ALPHA_SE_MOESI_hammer/m5.fast
    ./build/ALPHA_SE_MOESI_hammer/m5.fast ./configs/example/ruby_random_test.py -n 4 -l 200
 
 --
 Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev




Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-10 Thread Beckmann, Brad
Hi Malek,

Hmm...I have never seen that type of error before.  As you mentioned, I don't 
think any of my recent patches changed how DMA is executed for ALPHA_FS.

How long does it take for you to encounter the error?  It would be great if you 
could tell me how I can reproduce the error.  I would like to look at this in 
more detail and get a protocol trace of what is going on.

Thanks,

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Malek Musleh
 Sent: Thursday, February 10, 2011 5:05 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
 
 Hi Brad,
 
 I tested your latest changeset, and it seems that it 'solves' the
 handleResponse error I was getting when running 3 or more cores, but the
 dma_expiry error is still there.
 
 Such that, now the error is consistent, no matter what number of cores I try
 to run with:
 
 For more information see: http://www.m5sim.org/warn/3e0eccba
 panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1  @ cycle
 62411238889001
  [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc, line 323]
  Memory Usage: 382600 KBytes
 
 - M5 Terminal ---
 hda: max request size: 128KiB
 hda: 101808 sectors (52 MB), CHS=101/16/63
  hda:4hda: dma_timer_expiry: dma status == 0x65
 hda: DMA interrupt recovery
 hda: lost interrupt
  unknown partition table
 hdb: max request size: 128KiB
 hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
  hdb:4hdb: dma_timer_expiry: dma status == 0x65
 hdb: DMA interrupt recovery
 hdb: lost interrupt
 
 The panic error seems to suggest an inconsistent DMA state, so I tried
 reverting to an older changeset (before DMA changes were pushed out)
 such as 7936, and even 7930 but no such luck.
 
 The changeset that I know works from last week or so is changeset 7842.
  Looking at the changeset summaries between 7842 and 7930 seems to indicate
 a lot of changes 'unrelated' to the DMA, such as O3, InOrderCPU, and x86
 changes. That being said, I did not do a diff on those intermediate changesets
 to verify that maybe a related file was slightly modified in the process.
 
 I might be able to spend some more time trying changesets till I narrow down
 which one its coming from, but maybe the new panic message might give
 you some indication on how to fix it?
 
  (I think the panic message appeared now and not before because I let the
 simulation terminate itself when running overnight as opposed to me killing it
 once I saw the dma_expiry message on the M5 Terminal).
 
 Malek
 
 On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad
 brad.beckm...@amd.com wrote:
  Hi Malek,
 
  Yes, thanks for letting us know.  I'm pretty sure I know what the problem
 is.  Previously, if an SC operation failed, the RubyPort would convert the
 request packet to a response packet, bypass writing the functional view of
 memory, and pass it back up to the CPU.  In my most recent patches I
 generalized the mechanism that converts request packets to response
 packets and avoids writing functional memory.  However, I forgot to remove
 the duplicate request to response conversion for failed SC
 requests.  Therefore, I bet you are encountering that assertion error on that
 duplicate call.  It should be a simple one line change that fixes your
 problem.  I'll push it momentarily and it would be great if you could confirm
 that my change does indeed fix your problem.
 
  Brad
 
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Gabe Black
  Sent: Wednesday, February 09, 2011 3:54 PM
  To: M5 Developer List
  Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
 
  Thanks for letting us know. If it wouldn't be too much trouble, could
  you please try some other changesets near the one that isn't working
  and try to determine which one specifically broke things? A bunch of
  changes went in recently so it would be helpful to narrow things
  down. I'm not very involved with Ruby right now personally, but I
  assume that would be useful information for the people that are.
 
  Gabe
 
  On 02/09/11 14:51, Malek Musleh wrote:
   Hello,
  
   I first started using the Ruby Model in M5  about a week or so ago,
   and was able to boot in FS mode (up to 64 cores once applying the
   BigTsunami patches).
  
   In order to keep up with the changes in the Ruby code, I have
   started fetching recent updates from the devrepo.
  
   However, in fetching the updates to the recent changesets (from the
   last 2 days) Ruby FS does not boot. I tried both MESI_CMP_directory
   and MOESI_CMP_directory.
  
   If running 2 cores or less I get this at the terminal screen after
   letting it run for some time:
  
   hda: M5 IDE Disk, ATA DISK drive
   hdb: M5 IDE Disk, ATA DISK drive
   hda: UDMA/33 mode selected
   hdb: UDMA/33 mode selected
   ide0 at 0x8410-0x8417,0x8422 on irq 31
   ide1 at 0x8418

Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-10 Thread Beckmann, Brad
: listening for remote gdb #2 on port
  7002
  0: system.remote_gdb.listener: listening for remote gdb #3 on port
  7003
   REAL SIMULATION 
  info: Entering event queue @ 0.  Starting simulation...
  info: Launching CPU 1 @ 835461000
  info: Launching CPU 2 @ 846156000
  info: Launching CPU 3 @ 856768000
  warn: Prefetch instrutions is Alpha do not do anything For more
  information see: http://www.m5sim.org/warn/3e0eccba
  1349195500: system.terminal: attach terminal 0
  warn: Prefetch instrutions is Alpha do not do anything For more
  information see: http://www.m5sim.org/warn/3e0eccba
  m5.opt:
  build/ALPHA_FS_MOESI_CMP_directory/mem/ruby/system/RubyPort.cc:230:
  virtual bool RubyPort::M5Port::recvTiming(Packet*): Assertion
  `Address(ruby_request.paddr).getOffset() + ruby_request.len <=
  RubySystem::getBlockSizeBytes()' failed.
  Program aborted at cycle 2406378289516
  Aborted
 
 
  The same error occurs for 7907 - 7908.
 
  At changeset 7909 is where the dma_expiry error first shows up:
 
  7909:
 
  hda: M5 IDE Disk, ATA DISK drive
  hdb: M5 IDE Disk, ATA DISK drive
  hda: UDMA/33 mode selected
  hdb: UDMA/33 mode selected
  ide0 at 0x8410-0x8417,0x8422 on irq 31
  ide1 at 0x8418-0x841f,0x8426 on irq 31
  ide_generic: please use probe_mask=0x3f module parameter for
  probing all legacy ISA IDE ports
  ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
  ide3 at 0x170-0x177,0x376 on irq 15
  hda: max request size: 128KiB
  hda: 101808 sectors (52 MB), CHS=101/16/63
   hda:4hda: dma_timer_expiry: dma status == 0x65
  hda: DMA interrupt recovery
  hda: lost interrupt
   unknown partition table
  hdb: max request size: 128KiB
  hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
 
  I tested changeset 7920:
 
  and thats where I notice the handleResponse()
 
  7920:
 
  M5 compiled Feb 10 2011 14:49:49
  M5 revision 39c86a8306d2+ 7920+ default
  M5 started Feb 10 2011 14:53:38
  M5 executing on sherpa05
  command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
  ./configs/example/ruby_fs.py -n 4 --topology Crossbar
  Global frequency set at 1 ticks per second
  info: kernel located at:
  /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
  Listening for system connection on port 3456
0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1
  00:00:00 2009
  0: system.remote_gdb.listener: listening for remote gdb #0 on port
  7000
  0: system.remote_gdb.listener: listening for remote gdb #1 on port
  7001
  0: system.remote_gdb.listener: listening for remote gdb #2 on port
  7002
  0: system.remote_gdb.listener: listening for remote gdb #3 on port
  7003
   REAL SIMULATION 
  info: Entering event queue @ 0.  Starting simulation...
  info: Launching CPU 1 @ 835461000
  info: Launching CPU 2 @ 846156000
  info: Launching CPU 3 @ 856768000
  warn: Prefetch instrutions is Alpha do not do anything For more
  information see: http://www.m5sim.org/warn/3e0eccba
  1128875500: system.terminal: attach terminal 0
  warn: Prefetch instrutions is Alpha do not do anything For more
  information see: http://www.m5sim.org/warn/3e0eccba
  m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/packet.hh:590: void
  Packet::makeResponse(): Assertion `needsResponse()' failed.
  Program aborted at cycle 36235566500
  Aborted
 
  Note that I have not tested changesets 7911-7918.
 
  I have tested the MOESI_CMP_directory protocol on all of these with
  m5.opt. I have testes using MESI_CMP_directory for some of them and
  got the same messages.
 
  This is my command line:
 
  ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt
  ./configs/example/ruby_fs.py -n 4 --topology Crossbar
 
  The error comes at about 15 minutes in to boot the kernel. Note that
  it takes a while for the io to be scheduled.
 
  io scheduler noop registered
  io scheduler anticipatory registered
  io scheduler deadline registered
  io scheduler cfq registered (default)
 
  In all cases though where the dma_expiry occurs (which does not
  include changesets 7906-7908), the last thing that appears is this:
 
  ide0 at 0x8410-0x8417,0x8422 on irq 31
  ide1 at 0x8418-0x841f,0x8426 on irq 31
  ide_generic: please use probe_mask=0x3f module parameter for
  probing all legacy ISA IDE ports
  ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
  ide3 at 0x170-0x177,0x376 on irq 15
  hda: max request size: 128KiB
  hda: 101808 sectors (52 MB), CHS=101/16/63
   hda:4hda: dma_timer_expiry: dma status == 0x65
  hda: DMA interrupt recovery
  hda: lost interrupt
   unknown partition table
  hdb: max request size: 128KiB
  hdb: 4177920 sectors (2139 MB), CHS=4144/16/63
 
  Is it possible to generate a trace for Ruby in M5 the way it is for
  Ruby in GEMS like something of this sort:
 
  http://www.cs.wisc.edu/gems/doc/gems-
 wiki/moin.cgi/How_do_I_understan
  d_a_Protocol
 
  ?
 
  Let me know if you need anymore information.
 
  Malek
 
  On Thu, Feb 10, 2011 at 4:43 PM, Beckmann, Brad
 brad.beckm...@amd.com wrote:
  Hi Malek,
 
  Hmm...I have never seen that type

Re: [m5-dev] Ruby FS Fails with recent Changesets

2011-02-09 Thread Beckmann, Brad
Hi Malek,

Yes, thanks for letting us know.  I'm pretty sure I know what the problem is.  
Previously, if an SC operation failed, the RubyPort would convert the request 
packet to a response packet, bypass writing the functional view of memory, 
and pass it back up to the CPU.  In my most recent patches I generalized the 
mechanism that converts request packets to response packets and avoids writing 
functional memory.  However, I forgot to remove the duplicate request to 
response conversion for failed SC requests.  Therefore, I bet you are encountering 
that assertion error on that duplicate call.  It should be a simple one line 
change that fixes your problem.  I'll push it momentarily and it would be great 
if you could confirm that my change does indeed fix your problem.
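The failure mode described here can be illustrated with a small model. This is not gem5 code; the toy class below merely mimics the `needsResponse()` assertion inside `Packet::makeResponse()` (the command names are hypothetical), showing why a second, duplicate conversion of the same failed-SC packet trips the assertion.

```python
class Packet:
    """Toy model of a request packet and its response conversion."""
    # Hypothetical command names; only the request form "needs a response".
    RESPONSES = {"StoreCondReq": "StoreCondResp"}

    def __init__(self, cmd):
        self.cmd = cmd

    def needs_response(self):
        return self.cmd in self.RESPONSES

    def make_response(self):
        # Mirrors the assertion in mem/packet.hh: makeResponse() may only
        # be called on a packet that still needs a response.
        assert self.needs_response(), "needsResponse() failed"
        self.cmd = self.RESPONSES[self.cmd]

pkt = Packet("StoreCondReq")
pkt.make_response()        # generalized conversion path: succeeds
try:
    pkt.make_response()    # leftover SC-specific conversion: duplicate call
except AssertionError:
    print("second makeResponse() asserts, as in the reported failure")
```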

Brad



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Wednesday, February 09, 2011 3:54 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets
 
 Thanks for letting us know. If it wouldn't be too much trouble, could you
 please try some other changesets near the one that isn't working and try to
 determine which one specifically broke things? A bunch of changes went in
 recently so it would be helpful to narrow things down. I'm not very involved
 with Ruby right now personally, but I assume that would be useful
 information for the people that are.
 
 Gabe
 
 On 02/09/11 14:51, Malek Musleh wrote:
  Hello,
 
  I first started using the Ruby Model in M5  about a week or so ago,
  and was able to boot in FS mode (up to 64 cores once applying the
  BigTsunami patches).
 
  In order to keep up with the changes in the Ruby code, I have started
  fetching recent updates from the devrepo.
 
  However, in fetching the updates to the recent changesets (from the
  last 2 days) Ruby FS does not boot. I tried both MESI_CMP_directory
  and MOESI_CMP_directory.
 
  If running 2 cores or less I get this at the terminal screen after
  letting it run for some time:
 
  hda: M5 IDE Disk, ATA DISK drive
  hdb: M5 IDE Disk, ATA DISK drive
  hda: UDMA/33 mode selected
  hdb: UDMA/33 mode selected
  ide0 at 0x8410-0x8417,0x8422 on irq 31
  ide1 at 0x8418-0x841f,0x8426 on irq 31
  ide_generic: please use probe_mask=0x3f module parameter for probing
  all legacy ISA IDE ports
  ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
  ide3 at 0x170-0x177,0x376 on irq 15
  hda: max request size: 128KiB
  hda: 101808 sectors (52 MB), CHS=101/16/63
   hda:4hda: dma_timer_expiry: dma status == 0x65
  <--- problem
 
 
  When running 3 or more cores, I get the following assertion failure:
 
 
  info: kernel located at:
  /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux
  Listening for system connection on port 3456
0: system.tsunami.io.rtc: Real-time clock set to Thu Jan  1
  00:00:00 2009
  0: system.remote_gdb.listener: listening for remote gdb #0 on port
  7000
  0: system.remote_gdb.listener: listening for remote gdb #1 on port
  7001
  0: system.remote_gdb.listener: listening for remote gdb #2 on port
  7002
  0: system.remote_gdb.listener: listening for remote gdb #3 on port
  7003
   REAL SIMULATION 
  info: Entering event queue @ 0.  Starting simulation...
  info: Launching CPU 1 @ 834794000
  info: Launching CPU 2 @ 845489000
  info: Launching CPU 3 @ 856101000
  m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void
  Packet::makeResponse(): Assertion `needsResponse()' failed.
  Program aborted at cycle 97716
  Aborted
 
  The top of the tree is this last changeset:
 
  changeset:   7939:215c8be67063
  tag: tip
  user:Brad Beckmann brad.beckm...@amd.com
  date:Tue Feb 08 18:07:54 2011 -0800
  summary: regess: protocol regression tester updates
 
  I am not sure if those whom it concern are aware of it or not, or if
  there will be a soon to be updated changeset already in the works for
  this or not, but I figured I would bring it to your attention.
 
  Malek
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev




Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick

2011-02-08 Thread Beckmann, Brad
Hi Gabe,

Since you successfully updated the tests I can't run (ARM_FS), I can take care of 
the remaining errors (i.e. ruby protocol tests).  I have a few minor fixes I 
want to check in that I need to run the regression tester against anyways.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Tuesday, February 08, 2011 12:15 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-
 regression quick
 
 Hmm. I didn't realize all the build targets for ruby protocols had their own
 separate regressions. I'll have to run those too.
 
 Gabe
 
 On 02/08/11 00:17, Cron Daemon wrote:
   * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MESI_CMP_directory FAILED!
   * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MESI_CMP_directory FAILED!
   * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_hammer FAILED!
   * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MESI_CMP_directory FAILED!
   * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_hammer FAILED!
   * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_hammer FAILED!
   * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_directory FAILED!
   * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_directory FAILED!
   * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_CMP_token FAILED!
   * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_token FAILED!
   * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_token FAILED!
   * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_hammer FAILED!
   * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_CMP_token FAILED!
   * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MOESI_CMP_directory FAILED!
   scons: *** Source `tests/quick/01.hello-2T-smt/ref/alpha/linux/o3-timing/stats.txt' not found, needed by target `build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3-timing/status'.
   * build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/o3-timing passed.
   * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-atomic-mp passed.
   * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple-timing-mp passed.
   * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MESI_CMP_directory passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-atomic passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing passed.
   * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-atomic passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-atomic passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby passed.
   * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple-timing passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple-timing passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed.
   * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder-timing passed.
   * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_CMP_directory passed.
   * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby passed.
   * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing passed.
   * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-timing-dual passed.
   * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic-dual passed.
   * build/ALPHA_FS/tests/fast/quick/80.netperf-stream/alpha/linux/twosys-tsunami-simple-atomic passed.
   * build/ALPHA_FS/tests/fast/quick/10.linux-boot/alpha/linux/tsunami-simple-atomic passed.
   * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed.
   * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-atomic passed.
   * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing-ruby passed.
   * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing
 

Re: [m5-dev] Missing _ in ruby_fs.py

2011-02-08 Thread Beckmann, Brad
Ah, yes I did.  This actually reminds me that I need to fix how dma devices are 
connected within Ruby for x86_FS.  I'll push a patch that fixes these issues 
soon.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Tuesday, February 08, 2011 9:54 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Missing _ in ruby_fs.py
 
 Hi Brad, did you miss out on the '_' in _dma_devices?
 
 --
 Nilay
 
 
  diff -r 6f5299ff8260 -r 00ad807ed2ca configs/example/ruby_fs.py
  --- a/configs/example/ruby_fs.py    Sun Feb 06 22:14:18 2011 -0800
  +++ b/configs/example/ruby_fs.py    Sun Feb 06 22:14:18 2011 -0800
  @@ -109,12 +109,19 @@
  
   CPUClass.clock = options.clock
  
  -system = makeLinuxAlphaRubySystem(test_mem_mode, bm[0])
  -
  -system.ruby = Ruby.create_system(options,
  -                                 system,
  -                                 system.piobus,
  -                                 system._dma_devices)
  +if buildEnv['TARGET_ISA'] == "alpha":
  +    system = makeLinuxAlphaRubySystem(test_mem_mode, bm[0])
  +    system.ruby = Ruby.create_system(options,
  +                                     system,
  +                                     system.piobus,
  +                                     system.dma_devices)
  +elif buildEnv['TARGET_ISA'] == "x86":
  +    system = makeLinuxX86System(test_mem_mode, options.num_cpus, bm[0], True)
  +    system.ruby = Ruby.create_system(options,
  +                                     system,
  +                                     system.piobus)
  +else:
  +    fatal("incapable of building non-alpha or non-x86 full system!")
  
   system.cpu = [CPUClass(cpu_id=i) for i in xrange(options.num_cpus)]
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev




Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol

2011-02-08 Thread Beckmann, Brad
Hi Korey,

Just to clarify, the deadlock threshold in the sequencer is different than the 
deadlock threshold in the mem tester.  The sequencer's deadlock mechanism 
detects whether any particular request takes longer than the threshold.  
Meanwhile the mem tester deadlock threshold just ensures that a particular cpu 
sees at least one request complete within the deadlock threshold.  I don't 
think we want to degrade the deadlock checker to just a warning.  While in this 
particular case, the deadlock turned out to be just a performance issue, in my 
experience the vast majority of potential deadlock detections turn out to be 
real bugs.

Later today I'll check in a patch that increases the ruby mem test deadlock 
threshold.

Brad


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Korey Sewell
Sent: Monday, February 07, 2011 2:27 PM
To: M5 Developer List
Subject: Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol

Another followup on this is that the deadlock_threshold parameter doesnt 
propagate to the MemTester CPU.

So when I'm testing 64 CPUS, the memtester.cc still has this code:
if (!tickEvent.scheduled())
    schedule(tickEvent, curTick() + ticks(1));

if (++noResponseCycles >= 50) {
    if (issueDmas) {
        cerr << "DMA tester ";
    }
    cerr << name() << ": deadlocked at cycle " << curTick() << endl;
    fatal("");
}


That hardcoded 50 is not a great number (as people have said), because as 
your topologies/memory hierarchies change, the max # of cycles that you have 
to wait for a response can also change, right?

Increasing that # by hand is an arduous thing to do, so maybe that # should come 
from a parameter, and maybe we should also warn there that a deadlock is 
possible after some type of inordinate wait time.

The fix should be just to warn about a long wait after an inordinate 
period...Something like this I think:

if (++noResponseCycles % 50 == 0) {
    warn("cpu X has waited for %i cycles", noResponseCycles);
}


Lastly, should the memtester really send out a memory access on every tick? The 
actual injection rate could be much higher than the rate at which we resolve 
contention.

Maybe we should consider having X many outstanding requests per CPU as a more 
realistic measure that can stress the system but not make the noResponseCycles 
stat (?) grow to such a high number.
On Mon, Feb 7, 2011 at 1:27 PM, Beckmann, Brad 
brad.beckm...@amd.com wrote:
Yep, if I increase the deadlock threshold to 5 million cycles, the deadlock 
warning is not encountered.  However, I don't think that we should increase the 
default deadlock threshold to by an order-of-magnitude.  Instead, let's just 
increase the threashold for the mem tester.  How about I check in the following 
small patch.

Brad


diff --git a/configs/example/ruby_mem_test.py b/configs/example/ruby_mem_test.py
--- a/configs/example/ruby_mem_test.py
+++ b/configs/example/ruby_mem_test.py
@@ -135,6 +135,12 @@
     cpu.test = system.ruby.cpu_ruby_ports[i].port
     cpu.functional = system.funcmem.port
 
+    #
+    # Since the memtester is incredibly bursty, increase the deadlock
+    # threshold to 5 million cycles
+    #
+    system.ruby.cpu_ruby_ports[i].deadlock_threshold = 5000000
+
 for (i, dma) in enumerate(dmas):
     #
     # Tie the dma memtester ports to the correct functional port
diff --git a/tests/configs/memtest-ruby.py b/tests/configs/memtest-ruby.py
--- a/tests/configs/memtest-ruby.py
+++ b/tests/configs/memtest-ruby.py
@@ -96,6 +96,12 @@
     #
     cpus[i].test = ruby_port.port
     cpus[i].functional = system.funcmem.port
+
+    #
+    # Since the memtester is incredibly bursty, increase the deadlock
+    # threshold to 5 million cycles
+    #
+    ruby_port.deadlock_threshold = 5000000
 
 # ---
 # run simulation



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Monday, February 07, 2011 9:12 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory
 protocol

 Brad, I also see the protocol getting into a deadlock. I tried to get a trace, but
 I get a segmentation fault (yes, the segmentation fault only occurs when the trace
 flag ProtocolTrace is supplied). It seems to me that memory is getting
 corrupted somewhere, because the fault occurs in malloc itself.
 corrupted somewhere, because the fault occurs in malloc it self.

 It could be that the protocol is actually not in a deadlock. Both Arka and I had
 increased the deadlock threshold while testing the protocol. I will try with an
 increased threshold later in the day.

 One more thing, the Orion 2.0 code that was committed last night makes use
 of printf(). It did not compile cleanly for me. I had to change it to fatal() and
 include the header file base/misc.hh.
 the header file base/misc.hh.

 --
 Nilay

 On Mon, 7 Feb 2011, Beckmann, Brad wrote:

  FYI

Re: [m5-dev] changeset in m5: regress: Regression Tester output updates

2011-02-07 Thread Beckmann, Brad
Ugh...sorry about that.  I had to update most of the stats because one of 
Joel's patches added several new stats.  The problem was that I don't have the 
Linux kernel to run the ARM FS regression tests.  Therefore those tests didn't 
run correctly and thus I incorrectly updated those regression output files.  A 
similar problem occurred for the X86_SE  o3 test.

There is no excuse for my incorrect update of these regression output files.  
However, one thing that will help me in the future is making sure that all of 
us have the capability to run all regression tests.  Many of us, including myself, 
don't have login access to zizzer at Michigan, and thus it is very hard for me 
to reproduce the environment on zizzer, including external file dependencies.

Thanks,

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Monday, February 07, 2011 12:47 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] changeset in m5: regress: Regression Tester output
 updates

 I'm rolling back this stat update and rerunning/reupdating. It's going to 
 take a
 while, but I'll push once it's done.

 Gabe

 On 02/06/11 23:03, Ali Saidi wrote:
  This seems like a really half baked attempt to update the stats. You've
 removed all the ARM FS stats, and you're only seem to have updated a few
 of the quick tests, however all of the other tests aren't updated. Since you
 added some stats every test will need an update.
 
  Ali
 
 
 
  On Feb 7, 2011, at 12:17 AM, Brad Beckmann wrote:
 
  changeset 05f52a716144 in /z/repo/m5
  details: http://repo.m5sim.org/m5?cmd=changeset;node=05f52a716144
  description:
 regress: Regression Tester output updates
 
  diffstat:
 
  tests/quick/00.hello/ref/alpha/linux/inorder-timing/config.ini
 |13 +-
  tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout
 | 8 +-
  tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt
 |10 +-
  tests/quick/00.hello/ref/alpha/linux/o3-timing/config.ini  
  |
 13 +-
  tests/quick/00.hello/ref/alpha/linux/o3-timing/simout  
  |
 8 +-
  tests/quick/00.hello/ref/alpha/linux/o3-timing/stats.txt   
  |
 31 +-
  tests/quick/00.hello/ref/alpha/linux/simple-atomic/config.ini
 |11 +-
  tests/quick/00.hello/ref/alpha/linux/simple-atomic/simout
 | 8 +-
  tests/quick/00.hello/ref/alpha/linux/simple-atomic/stats.txt
 |24 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MESI_CMP_directory/config.ini  |14 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MESI_CMP_directory/ruby.stats  |30 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MESI_CMP_directory/simout  | 8 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MESI_CMP_directory/stats.txt   |26 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_directory/config.ini |68 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_directory/ruby.stats |54 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_directory/simout | 8 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_directory/stats.txt  |26 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_token/config.ini |68 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_token/ruby.stats |94 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_token/simout | 8 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_CMP_token/stats.txt  |26 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_hammer/config.ini|97 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_hammer/ruby.stats|   164 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_hammer/simout|10 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby-
 MOESI_hammer/stats.txt |30 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/config.ini
 |   228 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/ruby.stats
 |   282 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/simout
 | 8 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/stats.txt
 |26 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing/config.ini
 |13 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing/simout
 |12 +-
  tests/quick/00.hello/ref/alpha/linux/simple-timing/stats.txt
 |26 +-
  tests/quick/00.hello/ref/alpha/tru64/o3-timing/config.ini  
  |
 13 +-
  tests/quick/00.hello/ref/alpha/tru64/o3-timing/simout  
  |
 8 +-
  tests/quick/00.hello/ref/alpha/tru64/o3-timing/stats.txt   
  |
 30 +-
  

Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh

2011-02-07 Thread Beckmann, Brad
I agree Nilay.  Do you want to push that patch, or would you like me to take 
care of it?  Ideally Tushar should do it, but since he's in Singapore it is 
probably best that you or I do it.

Thanks for pointing that out.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Monday, February 07, 2011 9:23 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh
 
 Korey, I think the printf statements should be replaced with fatal() or
 panic() instead.
 
 --
 Nilay
 
 
 On Mon, 7 Feb 2011, Korey Sewell wrote:
 
  changeset 5f2a2deb377d in /z/repo/m5
  details: http://repo.m5sim.org/m5?cmd=changeset;node=5f2a2deb377d
  description:
  ruby: add stdio header in SRAM.hh
  missing header file caused RUBY_FS to not compile
 
  diffstat:
 
  src/mem/ruby/network/orion/Buffer/SRAM.hh |  1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
 
  diffs (11 lines):
 
  diff -r 2c2dc567a450 -r 5f2a2deb377d
 src/mem/ruby/network/orion/Buffer/SRAM.hh
  --- a/src/mem/ruby/network/orion/Buffer/SRAM.hh Mon Feb 07
 01:23:16 2011 -0800
  +++ b/src/mem/ruby/network/orion/Buffer/SRAM.hh Mon Feb 07
 12:19:46 2011 -0500
  @@ -39,6 +39,7 @@
   #include "mem/ruby/network/orion/Type.hh"
   #include "mem/ruby/network/orion/OrionConfig.hh"
   #include "mem/ruby/network/orion/TechParameter.hh"
   +#include <stdio.h>
 
  class OutdrvUnit;
  class AmpUnit;
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh

2011-02-07 Thread Beckmann, Brad
Hi Nilay,

I assume the printf's that give you problems are the five listed below.  Based 
on my little understanding of orion, I believe you can reach those errors if 
you misconfigure the buffer.  Therefore I do think that fatal is the correct 
call.

Brad


src/mem/ruby/network/orion/Buffer/BitlineUnit.cc:printf("error\n");
src/mem/ruby/network/orion/Buffer/OutdrvUnit.cc:printf("error\n");
src/mem/ruby/network/orion/Buffer/PrechargeUnit.cc:default: printf("error\n"); return 0;
src/mem/ruby/network/orion/Buffer/PrechargeUnit.cc:default: printf("error\n"); return 0;
src/mem/ruby/network/orion/Buffer/WordlineUnit.cc:printf("error\n");

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Monday, February 07, 2011 9:35 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh
 
 I can do it. I have replaced all of the printf()s with fatal()s.
 Is this correct, or should I use panic() instead?
 
 --
 Nilay
 
 
 On Mon, 7 Feb 2011, Beckmann, Brad wrote:
 
  I agree Nilay.  Do you want to push that patch, or would you like me
  to take care of it?  Ideally Tushar should do it, but since he's in
  Singapore it is probably best that you or I do it.
 
  Thanks for pointing that out.
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Nilay Vaish
  Sent: Monday, February 07, 2011 9:23 AM
  To: M5 Developer List
  Subject: Re: [m5-dev] changeset in m5: ruby: add stdio header in
  SRAM.hh
 
  Korey, I think the printf statements should be replaced with fatal()
  or
  panic() instead.
 
  --
  Nilay
 
 
  On Mon, 7 Feb 2011, Korey Sewell wrote:
 
  changeset 5f2a2deb377d in /z/repo/m5
  details:
 http://repo.m5sim.org/m5?cmd=changeset;node=5f2a2deb377d
  description:
ruby: add stdio header in SRAM.hh
missing header file caused RUBY_FS to not compile
 
  diffstat:
 
  src/mem/ruby/network/orion/Buffer/SRAM.hh |  1 +
  1 files changed, 1 insertions(+), 0 deletions(-)
 
  diffs (11 lines):
 
  diff -r 2c2dc567a450 -r 5f2a2deb377d
  src/mem/ruby/network/orion/Buffer/SRAM.hh
  --- a/src/mem/ruby/network/orion/Buffer/SRAM.hh   Mon Feb 07
  01:23:16 2011 -0800
  +++ b/src/mem/ruby/network/orion/Buffer/SRAM.hh   Mon Feb 07
  12:19:46 2011 -0500
  @@ -39,6 +39,7 @@
   #include "mem/ruby/network/orion/Type.hh"
   #include "mem/ruby/network/orion/OrionConfig.hh"
   #include "mem/ruby/network/orion/TechParameter.hh"
   +#include <stdio.h>
 
  class OutdrvUnit;
  class AmpUnit;
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol

2011-02-07 Thread Beckmann, Brad
Yep, if I increase the deadlock threshold to 5 million cycles, the deadlock 
warning is not encountered.  However, I don't think that we should increase the 
default deadlock threshold by an order of magnitude.  Instead, let's just 
increase the threshold for the mem tester.  How about I check in the following 
small patch?

Brad


diff --git a/configs/example/ruby_mem_test.py b/configs/example/ruby_mem_test.py
--- a/configs/example/ruby_mem_test.py
+++ b/configs/example/ruby_mem_test.py
@@ -135,6 +135,12 @@
 cpu.test = system.ruby.cpu_ruby_ports[i].port
 cpu.functional = system.funcmem.port
 
+#
+# Since the memtester is incredibly bursty, increase the deadlock
+# threshold to 5 million cycles
+#
+system.ruby.cpu_ruby_ports[i].deadlock_threshold = 5000000
+
 for (i, dma) in enumerate(dmas):
 #
 # Tie the dma memtester ports to the correct functional port
diff --git a/tests/configs/memtest-ruby.py b/tests/configs/memtest-ruby.py
--- a/tests/configs/memtest-ruby.py
+++ b/tests/configs/memtest-ruby.py
@@ -96,6 +96,12 @@
  #
  cpus[i].test = ruby_port.port
  cpus[i].functional = system.funcmem.port
+ 
+ #
+ # Since the memtester is incredibly bursty, increase the deadlock
+ # threshold to 5 million cycles
+ #
+ ruby_port.deadlock_threshold = 5000000
 
 # ---
 # run simulation



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Monday, February 07, 2011 9:12 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory
 protocol
 
 Brad, I also see the protocol getting into a deadlock. I tried to get a 
 trace, but
 I get segmentation fault (yes, the segmentation fault only occurs when trace
 flag ProtocolTrace is supplied). It seems to me that memory is getting
 corrupted somewhere, because the fault occurs in malloc itself.
 
  It could be that the protocol is actually not in a deadlock. Both Arka and I had
  increased the deadlock threshold while testing the protocol. I will try with an
  increased threshold later in the day.
 
  One more thing, the Orion 2.0 code that was committed last night makes use
  of printf(). It did not compile cleanly for me. I had to change it to fatal() and
  include the header file base/misc.hh.
 
 --
 Nilay
 
 On Mon, 7 Feb 2011, Beckmann, Brad wrote:
 
  FYI...If my local regression tests are correct.  This patch does not
  fix all the problems with the MESI_CMP_directory protocol.  One of the
  patches I just checked in fixes a subtle bug in the ruby_mem_test.
  Fixing this bug, exposes more deadlock problems in the
  MESI_CMP_directory protocol.
 
  To reproduce the regression tester's sequencer deadlock error, set the
  Randomization flag to false in the file
  configs/example/ruby_mem_test.py then run the following command:
 
  build/ALPHA_SE_MESI_CMP_directory/m5.debug
  configs/example/ruby_mem_test.py -n 8
 
  Let me know if you have any questions,
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Nilay Vaish
  Sent: Thursday, January 13, 2011 8:50 PM
  To: m5-dev@m5sim.org
  Subject: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory
  protocol
 
  changeset 8f37a23e02d7 in /z/repo/m5
  details: http://repo.m5sim.org/m5?cmd=changeset;node=8f37a23e02d7
  description:
 Ruby: Fixes MESI CMP directory protocol
 The current implementation of MESI CMP directory protocol is
 broken.
 This patch, from Arkaprava Basu, fixes the protocol.
 
  diffstat:
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] changeset in m5: regress: Regression Tester output updates

2011-02-07 Thread Beckmann, Brad
Hi Gabe,

Yes, the set of patches I checked in required a lot of changes to the output 
files.  I scanned parts of the regression tester patch last night and noticed 
those changes as well, including the 5x change (actually it is more like 10x in 
most cases) in simticks for the mem tester.  They all make sense to me.  There 
are multiple patches that impact the regression tester output.  The McPAT cpu 
counter and work unit patches added several new variables to every stats.txt 
file.  That is the major source of changes in those files you listed below 
(minus the memtester).

The large difference in the memtester is something else.  Another one of my 
patches fixed a problem with respect to what block size ruby indicated to the 
cpus.  By fixing this problem, it exposed the fact that ruby did not support 
the retry semantics expected by the cpu models.  I thus added that support, 
which then fixed a major problem in the memtester.  Interestingly enough, 
indicating a block size of 0 to the memtester caused the memtester to issue 
only one request at a time per cpu.  Now the memtester issues as many requests 
as possible to ruby until the sequencer's outstanding request count is reached 
(16 by default).  The significantly higher contention is the reason why the 
memtester simticks increase by 5-10x.

Overall, I am aware of the many changes I made to the regression tester output 
last night.  The problem was that my changes were so significant that I failed 
to realize that I also slipped in completely removing the regression tester 
output files for the ARM_FS and x86 o3 timing tests.  I also noticed last night 
that I couldn't successfully run the ARM_FS and x86 o3 timing tests locally, 
but I figured that was ok since those failures were due to environment issues.  
What I failed to do is put the two things together, and realize that I can't 
update regression tester output if I can't successfully run those tests.  Oh 
well, live and learn.

Thanks Gabe for rerunning those tests and updating the regression tester output!

Brad

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Monday, February 07, 2011 1:59 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] changeset in m5: regress: Regression Tester output
 updates

 Yeah, unfortunately some of those files we can't distribute, but I'm pretty
 sure the ARM Linux kernel we can. As we discussed before it would be ideal
 to move away from the regressions that need files to run that we can't
 actually give people, but that's likely going to be a lot of work.

 In any case, the regressions reran and I have an update. I went through all
 the diffs and saw lots of what I expected (new stats, different host stats,
 minor config.ini changes, different paths to things) but I also saw a few
 regressions with unexpected (by me) differences. These were:

 tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby
 tests/quick/00.hello/ref/alpha/tru64/simple-timing-ruby
 tests/quick/00.hello/ref/mips/linux/simple-timing-ruby
 tests/quick/00.hello/ref/sparc/linux/simple-timing-ruby
 tests/quick/00.hello/ref/x86/linux/simple-timing-ruby
 tests/quick/50.memtest/ref/alpha/linux/memtest-ruby
 tests/quick/60.rubytest/ref/alpha/linux/rubytest-ruby

 The memtest-ruby regression seems to be the most significantly affected
 where the number of ticks is increased by a factor of about 5. The patch I
 made is attached in case anybody wants to go through it. I'd suggest that at
 least a somebody that's familiar with Ruby go through the tests I pointed out
 and verify the changes are what they expected.

 Once one of the Ruby folks (Brad maybe?) lets me know everything is on
 track and nobody has asked otherwise, I'll go ahead and commit this.

 Gabe

 On 02/07/11 09:15, Beckmann, Brad wrote:
  Ugh...sorry about that.  I had to update most of the stats because one of
 Joel's patches added several new stats.  The problem was that I don't have
 the Linux kernel to run the ARM FS regression tests.  Therefore those tests
 didn't run correctly and thus I incorrectly updated those regression output
files.  A similar problem occurred for the X86_SE & o3 test.
 
  There is no excuse for my incorrect update of these regression output files.
 However, one thing that will help me in the future is making sure that all of
 us have the capability to run all regress tests.  Many of us, including 
 myself,
 don't have log in access to zizzer at Michigan, and thus it is very hard for 
 me
 to reproduce the environment on zizzer, including external file
 dependencies.
 
  Thanks,
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Gabe Black
  Sent: Monday, February 07, 2011 12:47 AM
  To: M5 Developer List
  Subject: Re: [m5-dev] changeset in m5: regress: Regression Tester
  output updates
 
  I'm rolling back this stat update and rerunning/reupdating

Re: [m5-dev] changeset in m5: scons: show sources and targets when building, ...

2011-02-06 Thread Beckmann, Brad
Do people mind if I change the source and target color from Yellow to Green?  I 
typically use a lighter background and the yellow text is very difficult to 
read.  I figure green is more conducive for both lighter and darker backgrounds 
and it keeps the Green Bay Packer theme.  :)

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Steve Reinhardt
 Sent: Friday, January 07, 2011 10:16 PM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] changeset in m5: scons: show sources and targets when
 building, ...

 changeset b5003ac75977 in /z/repo/m5
 details: http://repo.m5sim.org/m5?cmd=changeset;node=b5003ac75977
 description:
   scons: show sources and targets when building, and colorize output.

   I like the brevity of Ali's recent change, but the ambiguity of
   sometimes showing the source and sometimes the target is a little
   confusing.  This patch makes scons typically list all sources and
   all targets for each action, with the common path prefix factored
   out for brevity.  It's a little more verbose now but also more
   informative.

   Somehow Ali talked me into adding colors too, which is a whole
   'nother story.

 diffstat:

  SConstruct |  114 ++-
 -
  src/SConscript |   32 +-
  src/arch/SConscript|2 +-
  src/arch/isa_parser.py |2 -
  src/python/SConscript  |1 +
  src/python/m5/util/terminal.py |  113
 
  6 files changed, 227 insertions(+), 37 deletions(-)

 diffs (truncated from 442 to 300 lines):

 diff -r 9f9e10967912 -r b5003ac75977 SConstruct
 --- a/SConstruct  Tue Jan 04 21:40:49 2011 -0600
 +++ b/SConstruct  Fri Jan 07 21:50:13 2011 -0800
 @@ -1,5 +1,6 @@
  # -*- mode:python -*-

 +# Copyright (c) 2011 Advanced Micro Devices, Inc.
  # Copyright (c) 2009 The Hewlett-Packard Development Company  #
 Copyright (c) 2004-2005 The Regents of The University of Michigan  # All
 rights reserved.
 @@ -120,6 +121,18 @@

  from m5.util import compareVersions, readCommand

 +AddOption('--colors', dest='use_colors', action='store_true')
 +AddOption('--no-colors', dest='use_colors', action='store_false')
 +use_colors = GetOption('use_colors')
 +
 +if use_colors:
  +    from m5.util.terminal import termcap
  +elif use_colors is None:
  +    # option unspecified; default behavior is to use colors iff isatty
  +    from m5.util.terminal import tty_termcap as termcap
  +else:
  +    from m5.util.terminal import no_termcap as termcap
 +

 ##
 ##
  #
  # Set up the main build environment.
 @@ -357,7 +370,7 @@
  # the ext directory should be on the #includes path
  main.Append(CPPPATH=[Dir('ext')])

 -def _STRIP(path, env):
 +def strip_build_path(path, env):
  path = str(path)
  variant_base = env['BUILDROOT'] + os.path.sep
  if path.startswith(variant_base):
 @@ -366,29 +379,94 @@
  path = path[6:]
  return path

 -def _STRIP_SOURCE(target, source, env, for_signature):
 -return _STRIP(source[0], env)
 -main['STRIP_SOURCE'] = _STRIP_SOURCE
 +# Generate a string of the form:
  +#   common/path/prefix/src1, src2 -> tgt1, tgt2
 +# to print while building.
 +class Transform(object):
 +# all specific color settings should be here and nowhere else
 +tool_color = termcap.Normal
 +pfx_color = termcap.Yellow
 +srcs_color = termcap.Yellow + termcap.Bold
 +arrow_color = termcap.Blue + termcap.Bold
 +tgts_color = termcap.Yellow + termcap.Bold

 -def _STRIP_TARGET(target, source, env, for_signature):
 -return _STRIP(target[0], env)
 -main['STRIP_TARGET'] = _STRIP_TARGET
 +def __init__(self, tool, max_sources=99):
  +self.format = self.tool_color + (" [%8s] " % tool) \
  +  + self.pfx_color + "%s" \
  +  + self.srcs_color + "%s" \
  +  + self.arrow_color + " -> " \
  +  + self.tgts_color + "%s" \
 +  + termcap.Normal
 +self.max_sources = max_sources
 +
 +def __call__(self, target, source, env, for_signature=None):
 +# truncate source list according to max_sources param
 +source = source[0:self.max_sources]
 +def strip(f):
 +return strip_build_path(str(f), env)
  +if len(source) > 0:
 +srcs = map(strip, source)
 +else:
 +srcs = ['']
 +tgts = map(strip, target)
 +# surprisingly, os.path.commonprefix is a dumb char-by-char string
 +# operation that has nothing to do with paths.
 +com_pfx = os.path.commonprefix(srcs + tgts)
 +com_pfx_len = len(com_pfx)
 +if com_pfx:
 +# do some cleanup and sanity checking on common prefix
  +if com_pfx[-1] == ".":
 +# prefix matches all but file extension: ok
 + 

Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol

2011-02-06 Thread Beckmann, Brad
FYI...If my local regression tests are correct.  This patch does not fix all 
the problems with the MESI_CMP_directory protocol.  One of the patches I just 
checked in fixes a subtle bug in the ruby_mem_test.  Fixing this bug, exposes 
more deadlock problems in the MESI_CMP_directory protocol.

To reproduce the regression tester's sequencer deadlock error, set the 
Randomization flag to false in the file configs/example/ruby_mem_test.py, then 
run the following command:

build/ALPHA_SE_MESI_CMP_directory/m5.debug configs/example/ruby_mem_test.py -n 8

Let me know if you have any questions,

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Thursday, January 13, 2011 8:50 PM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol
 
 changeset 8f37a23e02d7 in /z/repo/m5
 details: http://repo.m5sim.org/m5?cmd=changeset;node=8f37a23e02d7
 description:
   Ruby: Fixes MESI CMP directory protocol
   The current implementation of MESI CMP directory protocol is
 broken.
   This patch, from Arkaprava Basu, fixes the protocol.
 
 diffstat:
 
   src/mem/protocol/MESI_CMP_directory-L1cache.sm |  25 +++--
   src/mem/protocol/MESI_CMP_directory-L2cache.sm |  25 -
  2 files changed, 35 insertions(+), 15 deletions(-)
 
 diffs (123 lines):
 
 diff -r 7107a2f3e53a -r 8f37a23e02d7
 src/mem/protocol/MESI_CMP_directory-L1cache.sm
 --- a/src/mem/protocol/MESI_CMP_directory-L1cache.sm  Thu Jan 13
 12:30:18 2011 -0800
 +++ b/src/mem/protocol/MESI_CMP_directory-L1cache.sm  Thu Jan 13
 22:17:11 2011 -0600
 @@ -70,6 +70,7 @@
 
   M_I, desc="L1 replacing, waiting for ACK";
   E_I, desc="L1 replacing, waiting for ACK";
  +SINK_WB_ACK, desc="This is to sink WB_Acks from L2";
 
}
 
 @@ -749,9 +750,8 @@
  l_popRequestQueue;
}
 
 -  transition(M_I, Inv, I) {
 +  transition(M_I, Inv, SINK_WB_ACK) {
  ft_sendDataToL2_fromTBE;
 -s_deallocateTBE;
  l_popRequestQueue;
}
 
 @@ -766,16 +766,14 @@
  l_popRequestQueue;
}
 
 -  transition(M_I, Fwd_GETX, I) {
 +  transition(M_I, Fwd_GETX, SINK_WB_ACK) {
  dt_sendDataToRequestor_fromTBE;
 -s_deallocateTBE;
  l_popRequestQueue;
}
 
 -  transition(M_I, {Fwd_GETS, Fwd_GET_INSTR}, I) {
 +  transition(M_I, {Fwd_GETS, Fwd_GET_INSTR}, SINK_WB_ACK) {
  dt_sendDataToRequestor_fromTBE;
  d2t_sendDataToL2_fromTBE;
 -s_deallocateTBE;
  l_popRequestQueue;
}
 
 @@ -865,6 +863,21 @@
  s_deallocateTBE;
  o_popIncomingResponseQueue;
}
 +
 +  transition(SINK_WB_ACK, {Load, Store, Ifetch, L1_Replacement}){
 +  z_recycleMandatoryQueue;
 +
 +  }
 +
 +  transition(SINK_WB_ACK, Inv){
 +fi_sendInvAck;
 +l_popRequestQueue;
 +  }
 +
 +  transition(SINK_WB_ACK, WB_Ack){
 +s_deallocateTBE;
 +o_popIncomingResponseQueue;
 +  }
  }
 
 
 diff -r 7107a2f3e53a -r 8f37a23e02d7
 src/mem/protocol/MESI_CMP_directory-L2cache.sm
 --- a/src/mem/protocol/MESI_CMP_directory-L2cache.sm  Thu Jan 13
 12:30:18 2011 -0800
 +++ b/src/mem/protocol/MESI_CMP_directory-L2cache.sm  Thu Jan 13
 22:17:11 2011 -0600
 @@ -734,11 +734,13 @@
// BASE STATE - I
 
// Transitions from I (Idle)
 -  transition({NP, IS, ISS, IM, SS, M, M_I, MT_I, MCT_I, I_I, S_I, SS_MB,
 M_MB, MT_IIB, MT_IB, MT_SB}, L1_PUTX) {
 +  transition({NP, IS, ISS, IM, SS, M, M_I, I_I, S_I, M_MB, MT_IB, MT_SB},
 L1_PUTX) {
 +t_sendWBAck;
  jj_popL1RequestQueue;
}
 
 -  transition({NP, SS, M, MT, M_I, MT_I, MCT_I, I_I, S_I, IS, ISS, IM, SS_MB,
 M_MB, MT_IIB, MT_IB, MT_SB}, L1_PUTX_old) {
 +  transition({NP, SS, M, MT, M_I, I_I, S_I, IS, ISS, IM, M_MB, MT_IB,
 MT_SB}, L1_PUTX_old) {
 +t_sendWBAck;
  jj_popL1RequestQueue;
}
 
 @@ -968,6 +970,10 @@
  mmu_markExclusiveFromUnblock;
  k_popUnblockQueue;
}
 +
 +  transition(MT_IIB, {L1_PUTX, L1_PUTX_old}){
 +zz_recycleL1RequestQueue;
 +  }
 
transition(MT_IIB, Unblock, MT_IB) {
  nnu_addSharerFromUnblock;
 @@ -1015,21 +1021,22 @@
  o_popIncomingResponseQueue;
}
 
 +  transition(MCT_I,  {L1_PUTX, L1_PUTX_old}){
 +zz_recycleL1RequestQueue;
 +  }
 +
// L1 never changed Dirty data
transition(MT_I, Ack_all, M_I) {
  ct_exclusiveReplacementFromTBE;
  o_popIncomingResponseQueue;
}
 
 -
 -  // drop this because L1 will send data again
 -  //  the reason we don't accept is that the request virtual network may be
 completely backed up
 -  // transition(MT_I, L1_PUTX) {
 -  //   jj_popL1RequestQueue;
 -  //}
 +  transition(MT_I, {L1_PUTX, L1_PUTX_old}){
 +zz_recycleL1RequestQueue;
 +  }
 
// possible race between unblock and immediate replacement
 -  transition(MT_MB, {L1_PUTX, L1_PUTX_old}) {
 +  transition({MT_MB,SS_MB}, {L1_PUTX, L1_PUTX_old}) {
  zz_recycleL1RequestQueue;
}
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 

Re: [m5-dev] PerfectSwitch

2011-02-03 Thread Beckmann, Brad
Hi Nilay,

Yes, you could make such an optimization, but you want to be careful not to 
introduce starvation.  You want to make sure that newly arriving messages are 
not always prioritized over previously stalled messages.

Could you avoid looping through all message buffers by creating a list of ready 
messages and simply scanning that instead?  You still want to store the 
messages in the message buffers because they model the virtual channel storage. 
 However, the list can be what the wakeup function actually scans.

Does that make sense to you, or am I overlooking something?

Brad
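Brad's ready-list suggestion can be sketched as follows. This is a minimal sketch with hypothetical names (ReadyListSwitch, enqueue, wakeup) — it is not gem5's actual PerfectSwitch code. Messages still live in per-(vnet, link) buffers, which model the virtual channel storage; the switch additionally records which buffers became non-empty, so wakeup() scans only those instead of iterating over every vnet x link combination:

```cpp
#include <cassert>
#include <deque>
#include <utility>
#include <vector>

// Hypothetical sketch of the ready-list idea, not gem5's PerfectSwitch.
// Messages stay in per-(vnet, link) buffers (modeling virtual channel
// storage); the ready list records which buffers hold pending messages,
// so wakeup() visits only those.
class ReadyListSwitch {
  public:
    ReadyListSwitch(int vnets, int links)
        : buffers(vnets, std::vector<std::deque<int>>(links)) {}

    // Called when a message (here just an int payload) arrives on a link.
    void enqueue(int vnet, int link, int msg) {
        if (buffers[vnet][link].empty())
            ready.push_back({vnet, link});  // first pending msg: mark ready
        buffers[vnet][link].push_back(msg);
    }

    // Drain only the buffers known to hold messages; returns the number
    // routed. FIFO order over the ready list keeps newly arriving buffers
    // from starving previously stalled ones.
    int wakeup() {
        int routed = 0;
        while (!ready.empty()) {
            auto [vnet, link] = ready.front();
            ready.pop_front();
            auto &buf = buffers[vnet][link];
            while (!buf.empty()) { buf.pop_front(); ++routed; }
        }
        return routed;
    }

  private:
    std::vector<std::vector<std::deque<int>>> buffers;
    std::deque<std::pair<int, int>> ready;
};
```

With this shape, the cost of wakeup() tracks the number of pending messages rather than vnets x links, which is the inner-loop overhead Nilay measured.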


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Thursday, February 03, 2011 10:23 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] PerfectSwitch
 
 On Thu, 3 Feb 2011, Nilay Vaish wrote:
 
  I implemented this approach. But it did not improve the performance.
  So I tried to explore what could be the cause. The function
  PerfectSwitch::wakeup() contains three loops.
 
  loop on number of virtual networks
   loop on number of incoming links
 loop till all messages for this (link, network) have been routed
 
  I am working with an 8 processor mesh network and run
  ruby_random_test.py for
  400,000 loads. About 11-12% of the time is taken by this function,
  which is the highest amongst all the functions. I moved the third loop
  to another function. I found that the wakeup function is itself called
  about 76,000,000 times, number of messages processed is about
  81,000,000. Out of these about
  71,000,000 have destination count = 1. Surprisingly the inner loop,
  that I had separated out as a function, was called 3,600,000,000
  times. That is about 45 times per invocation of the wakeup function,
  when each invocation of the wakeup function processes just about one
 message.
 
  When is the wakeup function called? Is it called in a periodic
  fashion? Or when a message needs to routed? Is it possible that
  instead of looking at all the virtual networks and links, we look at
  only those that have messages that need routing?
 
 
  I found that wakeup is scheduled only when a message needs to be routed.
  This is done using the consumer pointer. So we need to somehow inform
  the switch, whenever a wakeup event happens, which links and networks
  need to be looked at. But this would mean a change in the Consumer class
  and in the RubyEvent class. Should we add a new parameter to the
  scheduling function, which would be some information that the wakeup
  function receives?
 
 --
 Nilay
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: ruby: support to stallAndWait the mandatory queue

2011-01-23 Thread Beckmann, Brad
Thanks Arka for that response.  You summed it up well.

There are just a couple additional things I want to point out:


1.   One thing that makes this mechanism work is that one must rank each 
input port.   In other words, the programmer must understand and communicate 
the dependencies between message classes/protocol virtual channels.  That way 
the correct messages are woken up when the appropriate event occurs.

2.   In Nilay's example, you want to make sure that you don't delay the 
issuing of request A until the replacement of block B completes.  Instead, 
request A should allocate a TBE and issue in parallel with replacing B. The 
mandatory queue is popped only when the cache message is consumed.  When the 
cache message is stalled, it is basically moved to a temporary data structure 
within the message buffer, where it waits until a higher-priority message for the 
same cache block wakes it up.

Brad
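The stall-and-wait mechanism described above can be sketched roughly as follows. The names (StallBuffer, stallAndWait, wakeUpAllDependents) are hypothetical, not the actual SLICC-generated code: a blocked message moves off the input queue into a per-address FIFO, and is re-enqueued only when the blocking transaction for that address completes, instead of being recycled — and re-looked-up in the cache — every few cycles:

```cpp
#include <cassert>
#include <deque>
#include <map>
#include <string>

// Hypothetical sketch of SLICC's stall-and-wait idea (names invented).
// Rather than recycling a blocked message every recycle_latency cycles,
// the message is parked on a per-address FIFO and re-enqueued only when
// the blocking transaction completes.
struct StallBuffer {
    std::deque<std::string> inputQueue;              // the message buffer
    std::map<int, std::deque<std::string>> waiting;  // addr -> stalled msgs

    // Park the head message until the given address unblocks.
    void stallAndWait(int addr) {
        waiting[addr].push_back(inputQueue.front());
        inputQueue.pop_front();
    }

    // On unblock, re-enqueue all messages stalled on this address.
    // FIFO order preserves arrival fairness among the stalled requests.
    void wakeUpAllDependents(int addr) {
        auto it = waiting.find(addr);
        if (it == waiting.end()) return;
        for (auto &msg : it->second)
            inputQueue.push_back(msg);
        waiting.erase(it);
    }
};
```

This is why the mechanism both improves fairness (stalled messages wake in arrival order) and cuts redundant cache lookups (a parked message is never re-examined until its address unblocks).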


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Arkaprava Basu
Sent: Saturday, January 22, 2011 10:49 AM
To: M5 Developer List
Cc: Gabe Black; Ali Saidi
Subject: Re: [m5-dev] Review Request: ruby: support to stallAndWait the 
mandatory queue

Hi Nilay,

You are mostly correct. I believe this patch contains two things

1. Support in SLICC to allow waiting and stalling on messages in a message buffer 
when the directory is in a blocking state for that address (i.e. it cannot 
process the message at this point), until some event occurs that can make 
consumption of the message possible. When the directory unblocks, it provides 
the support for waking up the messages that were hitherto waiting (this is the 
precise reason why you did not see a pop of the mandatory queue, but do see 
WakeUpAllDependants).

2. It contains changes to MOESI_hammer protocol that leverages this support.

For the purpose of this particular discussion, the 1st part is the relevant one.

As far as I understand, the support in SLICC for waiting and stalling was 
introduced primarily to enhance fairness in the way SLICC handles coherence 
requests. Without this support, when a message arrives at a controller in a 
blocking state, it recycles, which means it polls again (and thus looks up 
again) in 10 cycles (generally the recycle latency is set to 10). If multiple 
messages arrive while the controller is in a blocking state for a given 
address, you can easily see that there is NO fairness. A message that arrived 
latest for the blocking address can be served first when the controller 
unblocks. With the new support for stalling and waiting, the blocked messages 
are put in a FIFO queue, thus providing better fairness.
But as you have correctly guessed, another major advantage of this support is 
that it reduces the unnecessary lookups to the cache structure that happen due 
to polling (a.k.a. recycling).  So in summary, I believe that the problem you 
are seeing with too many lookups will *reduce* when the protocols are adjusted 
to take advantage of this facility. On a related note, I should also mention that 
another fringe benefit of this support is that it helps in debugging coherence 
protocols. With it, coherence protocol traces won't contain thousands of 
debug messages for recycling, which can be pretty annoying for protocol 
writers.
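The recycle-vs-stall tradeoff Arka describes can be illustrated with a toy model. This is not gem5 code; class and method names (`MessageBuffer`, `recycle_step`, `stall_and_wait_step`, `wake_up_all_dependents`) are illustrative only:

```python
from collections import defaultdict, deque

class MessageBuffer:
    """Toy model contrasting the recycle and stall-and-wait policies."""
    def __init__(self):
        self.queue = deque()                # incoming messages, FIFO
        self.stalled = defaultdict(deque)   # per-address FIFO of stalled msgs
        self.lookups = 0                    # cache-structure lookups performed

    def enqueue(self, addr, payload):
        self.queue.append((addr, payload))

    def recycle_step(self, blocked):
        """Old policy: pop, look up, and re-enqueue if the address is blocked.
        Every poll costs another lookup, and arrival order is lost."""
        addr, payload = self.queue.popleft()
        self.lookups += 1
        if addr in blocked:
            self.queue.append((addr, payload))   # poll again later
            return None
        return (addr, payload)

    def stall_and_wait_step(self, blocked):
        """New policy: move blocked messages aside; no repeated lookups."""
        addr, payload = self.queue.popleft()
        self.lookups += 1
        if addr in blocked:
            self.stalled[addr].append((addr, payload))
            return None
        return (addr, payload)

    def wake_up_all_dependents(self, addr):
        """On unblock, re-insert the waiters in arrival (FIFO) order."""
        while self.stalled[addr]:
            self.queue.append(self.stalled[addr].popleft())
```

With stall-and-wait, two messages blocked on the same address are woken in the order they arrived, and each costs exactly one lookup while blocked.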

I hope this helps,

Thanks
Arka



On 01/22/2011 06:40 AM, Nilay Vaish wrote:



---

This is an automatically generated e-mail. To reply, visit:

http://reviews.m5sim.org/r/408/#review797

---





I was thinking about why the ratio of the number of memory lookups, as reported 
by gprof, to the number of memory references, as reported in stats.txt, is so high.

While I was working with the MESI CMP directory protocol, I had seen that the same 
request from the processor is looked up again and again in the cache, if the request 
is waiting for some event to happen. For example, suppose a processor asks for loading 
address A, but the cache has no space for holding address A. Then, it will give up 
some cache block B before it can bring in address A.

The problem is that while the cache block B is being given up, it is possible that the 
request made for address A is looked up in the cache again, even though we know it 
is not possible that we would find it in the cache. This is because the requests in 
the mandatory queue are recycled till they get done with.

Clearly, we should move the request for bringing in address A to a separate structure, 
instead of looking it up again and again. The new structure should be looked up whenever 
an event, that could possibly affect the status of this request, occurs. If we do this, 
then I think we should see a further reduction in the number of lookups. I would expect 
almost 90% of the lookups to the cache to go away. This should also mean a 5% improvement 
in simulator performance.

Brad, do agree 
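The separate, event-driven structure proposed above can be sketched as a toy model. This is not gem5 code; the class and names (`PendingRequestTable`, `park`, `on_event`, the `("eviction_done", "B")` key) are hypothetical:

```python
class PendingRequestTable:
    """Side structure for requests that cannot complete yet.

    Instead of recycling a blocked request through the mandatory queue
    (one cache lookup per poll), park it keyed by the event it waits on
    and look it up exactly once, when that event fires.
    """
    def __init__(self):
        self.waiting = {}    # event_key -> requests parked on that event
        self.lookups = 0     # cache-structure lookups performed

    def park(self, event_key, request):
        # e.g. event_key = ("eviction_done", "B") while block B is replaced
        self.waiting.setdefault(event_key, []).append(request)

    def on_event(self, event_key):
        """Wake only the requests this event can unblock; one lookup each."""
        ready = self.waiting.pop(event_key, [])
        self.lookups += len(ready)
        return ready
```

In the A-needs-space example, "load A" is parked until B's eviction completes, so no lookups occur while the eviction is in flight.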

Re: [m5-dev] Error in Simulating Mesh Network

2011-01-23 Thread Beckmann, Brad
Yes, but right now my repo is a couple weeks behind the main repo and I'd 
rather get all these patches resolved first, then sync up with main repo and do 
my final regression testing once.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Saturday, January 22, 2011 2:26 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Error in Simulating Mesh Network
 
 You should be able to move that around any other patches ahead of it, right?
 It's so simple I wouldn't expect it to really depend on the intervening
 patches.
 
 Gabe
 
 Beckmann, Brad wrote:
  Hi Nilay,
 
  Yes, I am aware of this problem and one of the patches
 (http://reviews.m5sim.org/r/381/) I'm planning to check in does fix this.
 Unfortunately, those patches are being hung up because I need to do some
 more work on another one of them and right now I don't have any time to do
 so.   As you can see from the patch, it is a very simple fix, so you may want
 to do it locally if it is blocking you.
 
  Brad
 
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-
 boun...@m5sim.org] On
  Behalf Of Nilay Vaish
  Sent: Thursday, January 20, 2011 6:16 AM
  To: m5-dev@m5sim.org
  Subject: [m5-dev] Error in Simulating Mesh Network
 
  Brad, I tried simulating a mesh network with four processors.
 
  ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py
  --maxtick 2000 -n 4 --topology Mesh --mesh-rows 2
  --num-l2cache 4 --num-dir 4
 
  I receive the following error:
 
  panic: FIFO ordering violated: [MessageBuffer:  consumer-yes [
  [71227521, 870, 1; ] ]] [Version 1, L1Cache, triggerQueue_in]
name: [Version 1, L1Cache, triggerQueue_in] current time: 71227512
 delta:
  1 arrival_time: 71227513 last arrival_time: 71227521
@ cycle 35613756000
 
 [enqueue:build/ALPHA_FS_MOESI_hammer/mem/ruby/buffers/MessageB
  uffer.cc,
  line 198]
 
  Do you think that the options I have specified should work correctly?
 
  Thanks
  Nilay
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Error in Simulating Mesh Network

2011-01-20 Thread Beckmann, Brad
Hi Nilay,

Yes, I am aware of this problem and one of the patches 
(http://reviews.m5sim.org/r/381/) I'm planning to check in does fix this.  
Unfortunately, those patches are being hung up because I need to do some more 
work on another one of them and right now I don't have any time to do so.   As 
you can see from the patch, it is a very simple fix, so you may want to do it 
locally if it is blocking you.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Thursday, January 20, 2011 6:16 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Error in Simulating Mesh Network
 
 Brad, I tried simulating a mesh network with four processors.
 
  ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py
  --maxtick 2000 -n 4 --topology Mesh --mesh-rows 2 --num-l2cache 4
 --num-dir 4
 
 I receive the following error:
 
 panic: FIFO ordering violated: [MessageBuffer:  consumer-yes [ [71227521,
 870, 1; ] ]] [Version 1, L1Cache, triggerQueue_in]
   name: [Version 1, L1Cache, triggerQueue_in] current time: 71227512 delta:
 1 arrival_time: 71227513 last arrival_time: 71227521
   @ cycle 35613756000
 [enqueue:build/ALPHA_FS_MOESI_hammer/mem/ruby/buffers/MessageB
 uffer.cc,
 line 198]
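The panic above fires because a strictly ordered message buffer requires arrival times to be non-decreasing at enqueue time. A toy sketch of that invariant check, not the actual MessageBuffer.cc code (the class name and first enqueue are illustrative):

```python
class OrderedBuffer:
    """Enqueue-time check mirroring the 'FIFO ordering violated' panic."""
    def __init__(self):
        self.arrivals = []
        self.last_arrival = None

    def enqueue(self, current_time, delta):
        arrival = current_time + delta
        # In a strictly ordered buffer, arrival times must be non-decreasing;
        # a message may not be scheduled to arrive before one already queued.
        if self.last_arrival is not None and arrival < self.last_arrival:
            raise RuntimeError(
                f"FIFO ordering violated: arrival_time {arrival} "
                f"< last arrival_time {self.last_arrival}")
        self.last_arrival = arrival
        self.arrivals.append(arrival)
```

Plugging in the numbers from the panic (current time 71227512, delta 1, last arrival 71227521) trips the same check.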
 
 Do you think that the options I have specified should work correctly?
 
 Thanks
 Nilay
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] (no subject)

2011-01-19 Thread Beckmann, Brad
Hi Nilay,

Yes, that is correct.  There is a comment at the top of the file: 
src/mem/ruby/network/topologies/Mesh.py which says that very thing:

# Makes a generic mesh assuming an equal number of cache and directory cntrls

The only exception the function makes is for the dma controllers, whose count 
need not be a multiple of the number of routers.  If you know of a better way to 
handle this, please
let me know.
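Under that equal-counts assumption, the controller-to-router mapping reduces to something like the following sketch. This is a hypothetical helper, not the actual Mesh.py code; only the assumption it enforces comes from the discussion above:

```python
def map_controllers(num_routers, num_l1s, num_l2s, num_dirs, num_dmas):
    """Attach controllers to mesh routers, mirroring Mesh.py's assumption:
    every non-DMA controller class must divide evenly across the routers."""
    for name, count in (("L1", num_l1s), ("L2", num_l2s), ("dir", num_dirs)):
        assert count % num_routers == 0, \
            f"number of {name} controllers must be a multiple of routers"
    mapping = {r: [] for r in range(num_routers)}
    for name, count in (("L1", num_l1s), ("L2", num_l2s), ("dir", num_dirs)):
        per_router = count // num_routers
        for i in range(count):
            mapping[i // per_router].append(f"{name}{i}")
    # DMA controllers may be fewer than routers; spread them round-robin.
    for i in range(num_dmas):
        mapping[i % num_routers].append(f"dma{i}")
    return mapping
```

With 4 routers and 4 of each cache/directory controller, each router gets one of each; a count such as 6 L1s would trip the assertion, which is the constraint being discussed.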

Brad


 -Original Message-
 From: Nilay [mailto:ni...@cs.wisc.edu]
 Sent: Tuesday, January 18, 2011 9:28 PM
 To: Beckmann, Brad
 Cc: m5-dev@m5sim.org
 Subject: RE:
 
 Brad,
 
 I got the simulation working. It seems to me that you wrote Mesh.py under
 the assumption that number of cpus = number of L1 controllers = number of
 L2 controllers (if present) = number of directory controllers.
 
 The following options worked after some struggle and some help from Arka -
 
 ./build/ALPHA_FS_MESI_CMP_directory/m5.fast
 ./configs/example/ruby_fs.py --maxtick 20 -n 16 --topology Mesh --
 mesh-rows 4 --num-dirs 16 --num-l2caches 16
 
 --
 Nilay
 
 
 On Tue, January 18, 2011 10:28 am, Beckmann, Brad wrote:
  Hi Nilay,
 
  My plan is to tackle the functional access support as soon as I check
  in our current group of outstanding patches.  I'm hoping to at least
  check in the majority of them in the next couple of days.  Now that
  you've completed the CacheMemory access changes, you may want to
  re-profile GEM5 and make sure the next performance bottleneck is
  routing network messages in the Perfect Switch.  In particular, you'll
  want to look at rather large (16+ core) systems using a standard Mesh
  network.  If you have any questions on how to do that, Arka may be
  able to help you out, if not, I can certainly help you.  Assuming the
  Perfect Switch shows up as a major bottleneck (> 10%),  then I would
  suggest that as the next area you can work on.  When looking at
  possible solutions, don't limit yourself to just changes within
  Perfect Switch itself.  I suspect that redesigning how destinations
  are encoded and/or the interface between MessageBuffer dequeues and
 the PerfectSwitch wakeup, will lead to a better solution.
 
  Brad
 
 
  -Original Message-
  From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
  Sent: Tuesday, January 18, 2011 6:59 AM
  To: Beckmann, Brad
  Cc: m5-dev@m5sim.org
  Subject:
 
  Hi Brad
 
  Now that those changes to CacheMemory, SLICC and protocol files have
  been pushed in, what's next that you think we should work on? I was
  going through some of the earlier emails. You have mentioned
  functional access support in Ruby, design of the Perfect Switch,
  consolidation of stat files.
 
  Thanks
  Nilay
 
 
 


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] (no subject)

2011-01-18 Thread Beckmann, Brad
Hi Nilay,

My plan is to tackle the functional access support as soon as I check in our 
current group of outstanding patches.  I'm hoping to at least check in the 
majority of them in the next couple of days.  Now that you've completed the 
CacheMemory access changes, you may want to re-profile GEM5 and make sure the 
next performance bottleneck is routing network messages in the Perfect Switch.  
In particular, you'll want to look at rather large (16+ core) systems using a 
standard Mesh network.  If you have any questions on how to do that, Arka may 
be able to help you out, if not, I can certainly help you.  Assuming the 
Perfect Switch shows up as a major bottleneck (> 10%),  then I would suggest 
that as the next area you can work on.  When looking at possible solutions, 
don't limit yourself to just changes within Perfect Switch itself.  I suspect 
that redesigning how destinations are encoded and/or the interface between 
MessageBuffer dequeues and the PerfectSwitch wakeup, will lead to a b
 etter solution.

Brad


 -Original Message-
 From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
 Sent: Tuesday, January 18, 2011 6:59 AM
 To: Beckmann, Brad
 Cc: m5-dev@m5sim.org
 Subject:
 
 Hi Brad
 
 Now that those changes to CacheMemory, SLICC and protocol files have
 been pushed in, what's next that you think we should work on? I was going
 through some of the earlier emails. You have mentioned functional access
 support in Ruby, design of the Perfect Switch, consolidation of stat files.
 
 Thanks
 Nilay


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Question on SLICC

2011-01-18 Thread Beckmann, Brad
Nilay,

Are you trying to replace CacheMsg with RubyRequest?  I agree that we can 
probably get rid of one of them.  If I recall, right now RubyRequest is defined 
in libruby.hh.  Is the Ruby library interface still important to you all at 
Wisconsin?  If not, I would like to get rid of the libruby files.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Tuesday, January 18, 2011 10:45 AM
 To: M5 Developer List
 Subject: Re: [m5-dev] Question on SLICC
 
  Figured that out last night. I also noticed that there is a comment about it in
 RubySlicc_Types.sm (should read files more carefully). Actually, I am trying 
 to
 get rid of CacheMsg class. Currently, RubyRequest is created from packet
 (which I believe is an m5 primitive) and then a CacheMsg is created from
 RubyRequest.
 
 Thanks
 Nilay
 
 
 On Tue, 18 Jan 2011, nathan binkert wrote:
 
  There are certain types defined in the file
  src/mem/protocol/RubySlicc_Types.sm. For each of the type is .hh is
  gets written which contains the path of the actual header file to be
  used. For example, the file RubySlicc_Types.sm defines CacheMemory
  type. This type is actually defined in the file
  src/mem/ruby/system/CacheMemory.hh. When a protocol is compiled,
 the
  file build/protocol_name/mem/protocol/CacheMemory.hh gets
 written.
  This file contains just one line - #include path to
  CacheMemory.hh
 
  My question is which script writes this file. I have looked around
  but have not been able to figure it out yet.
 
  That gets done in src/mem/ruby/SConscript.  The reason it gets done
  there is because the .hh file is actually in the system directory, but
  the way the slicc code is generated, it tries to include it from the
  protocol directory.  In the original slicc/ruby, this didn't matter
  because all directories were in the include search path, but in M5 we
  need to know the path.  There was no easy way to fix this, so this
  ugly band aid exists.  Be awesome to get rid of it.
 
   Nate
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] EIO Regression Tests

2011-01-17 Thread Beckmann, Brad
Hi Nilay,

I understand your confusion.  This is an example of where the wiki needs to be 
updated.  I believe the wiki only mentions the encumbered tar ball and doesn't 
mention the encumbered hg repo on repo.m5sim.org.  As far as the anagram test 
program goes, I remember Lisa and I encountered the same issue a while back and 
to resolve it I believe Lisa copied that test along with several other 
regression tester programs from Michigan to AMD.

I can provide you those regression tester programs, but at a higher level, I 
think this is a good time to ask the question on how we want to provide 
external users all the files necessary to run the regression tester?  As Nilay 
points out, the encumbered repo has some, but not all of the necessary files.  
I believe, one also needs another set of regression tester programs which 
include both the anagram files, as well as the SPECCPU files for the long 
regression tester runs.

Thoughts?

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
 Sent: Monday, January 17, 2011 1:55 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] EIO Regression Tests
 
 I figured that out, but there is no anagram directory in tests/test-
 progs.
 I, therefore, receive the following error:
 
 gzip: tests/test-progs/anagram/bin/alpha/eio/anagram-vshort.eio.gz: No
 such file or directory
 
 --
 Nilay
 
 On Mon, 17 Jan 2011, Steve Reinhardt wrote:
 
   The one where the EIO code lives.  That's its name, at
  http://repo.m5sim.org.
 
  On Mon, Jan 17, 2011 at 12:59 PM, Nilay Vaish ni...@cs.wisc.edu
 wrote:
 
  What do you mean by the encumbered repository?
 
 
  On Mon, 17 Jan 2011, Steve Reinhardt wrote:
 
   Yes, it should be a concern... it should work.  Did you do a pull
 on the
  encumbered repository?  There were some changes there needed to
 maintain
  compatibility with the latest m5 dev repo.
 
  Otherwise you'll need to provide more detail about how things
 failed.
 
  Steve
 
  On Mon, Jan 17, 2011 at 10:21 AM, Nilay Vaish ni...@cs.wisc.edu
 wrote:
 
   I just ran the regression tests for the patch (deals with SLICC
 and cache
  coherence protocols) that I need to commit. The EIO tests fail.
 Should
  this
  be a concern?
 
  --
  Nilay
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
   ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] EIO Regression Tests

2011-01-17 Thread Beckmann, Brad
Thanks Gabe.

I had completely forgotten about the fact we can freely distribute some of 
those tests.  Your suggestion of creating a second, shorter regression tester 
that focuses on testing different mechanisms sounds like a great idea.  
Hopefully we can get that done sometime.

In the meantime, let's just make a note to update the wiki in the near future 
on the current procedure for running the regression tester, pointing people to 
the binaries that we can't distribute ourselves.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Gabriel Michael Black
 Sent: Monday, January 17, 2011 4:23 PM
 To: m5-dev@m5sim.org
 Subject: Re: [m5-dev] EIO Regression Tests
 
 I think there are two important aspects of this issue.
 
 1. Using regression tests we can't distribute freely has some
 important limitations. It would be nice to replace them with ones we
 can.
 
 2. The majority of the regression tests we have now are really
 benchmarks which provide basic coverage by working/not working and not
 changing behavior unexpectedly. That's an important element to have
 since it's a practical reality check and probably hits things we
 wouldn't think to test. They have significant limitations, though,
 since they take a long time to run and tend to exercise the same
 simulator functionality over and over. For instance, gcc may generate
 code that always has the same type of backward branch for a for loop.
 Using gzip as a test will verify that that branch works, but possibly
 not the slightly different variant that may, for instance, use a large
 branch displacement. Even when writing code in x86 assembly it can be
 impossible to predict which of the possibly many redundant instruction
 encodings the assembler might pick.
 
 So, in everyone's infinite free time, I think we should replace our
 benchmark based regressions with a smaller set of freely distributable
 regressions/inputs, and augment them with shorter, targeted tests that
 exercise particular mechanisms, circumstances, instructions, etc.
 Instead of replacing our existing benchmarks which are useful as
 actual benchmarks and are good to keep working, we could build up this
 second set of tests in parallel.
 
 Gabe
 
 Quoting Beckmann, Brad brad.beckm...@amd.com:
 
  Hi Nilay,
 
  I understand your confusion.  This is an example of where the wiki
  needs to be updated.  I believe the wiki only mentions the
  encumbered tar ball and doesn't mention the encumbered hg repo on
  repo.m5sim.org.  As far as the anagram test program goes, I remember
  Lisa and I encountered the same issue a while back and to resolve it
  I believe Lisa copied that test along with several other regression
  tester programs from Michigan to AMD.
 
  I can provide you those regression tester programs, but at a higher
  level, I think this is a good time to ask the question on how we
  want to provide external users all the files necessary to run the
  regression tester?  As Nilay points out, the encumbered repo has
  some, but not all of the necessary files.  I believe, one also needs
  another set of regression tester programs which include both the
  anagram files, as well as the SPECCPU files for the long regression
  tester runs.
 
  Thoughts?
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
  Behalf Of Nilay Vaish
  Sent: Monday, January 17, 2011 1:55 PM
  To: M5 Developer List
  Subject: Re: [m5-dev] EIO Regression Tests
 
  I figured that out, but there is no anagram directory in tests/test-
  progs.
  I, therefore, receive the following error:
 
  gzip: tests/test-progs/anagram/bin/alpha/eio/anagram-vshort.eio.gz:
 No
  such file or directory
 
  --
  Nilay
 
  On Mon, 17 Jan 2011, Steve Reinhardt wrote:
 
   The one where the EIO code lives.  That's it's name, at
   http://repo.m5sim.org.
  
   On Mon, Jan 17, 2011 at 12:59 PM, Nilay Vaish ni...@cs.wisc.edu
  wrote:
  
   What do you mean by the encumbered repository?
  
  
   On Mon, 17 Jan 2011, Steve Reinhardt wrote:
  
Yes, it should be a concern... it should work.  Did you do a
 pull
  on the
   encumbered repository?  There were some changes there needed to
  maintain
   compatibility with the latest m5 dev repo.
  
   Otherwise you'll need to provide more detail about how things
  failed.
  
   Steve
  
   On Mon, Jan 17, 2011 at 10:21 AM, Nilay Vaish
 ni...@cs.wisc.edu
  wrote:
  
I just ran the regression tests for the patch (deals with SLICC
  and cache
   coherence protocols) that I need to commit. The EIO tests fail.
  Should
   this
   be a concern?
  
   --
   Nilay
   ___
   m5-dev mailing list
   m5-dev@m5sim.org
   http://m5sim.org/mailman/listinfo/m5-dev
  
  
___
   m5-dev mailing list
   m5-dev@m5sim.org
   http://m5sim.org/mailman/listinfo/m5-dev

Re: [m5-dev] MOESI_CMP_token

2011-01-14 Thread Beckmann, Brad
Hi Nilay,

There is often a tradeoff between doing operations in actions versus the input 
port.  Overall, I agree with you that we should concentrate on doing most/all 
operations in actions, not the input ports.  The input port logic is often a 
confusing nested conditional mess and performing other operations inside the 
input port logic only further confuses things.  I believe the reason why the 
ExternalResponse is monitored at the input port for the token protocol, is 
because this is a critical piece of information needed for tuning the dynamic 
timeout latency.  It is likely that Mike Marty (who I believe is the original 
author) just wanted to make sure he always correctly identified external 
responses.

My suggestion for you is not to worry about it and just keep the logic as is.  
There is no need to give yourself extra work.  To my knowledge GEM5 has yet to 
be configured into multiple chips and most of the ExternalResponse logic deals 
with separating local cache hits vs. remote cache hits.  Once we configure 
multiple chip systems, we can revisit the ExternalResponse logic and possibly 
optimize it.

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay
 Sent: Friday, January 14, 2011 1:12 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] MOESI_CMP_token
 
 I am trying to update the MOESI CMP token protocol. Line 563 in the
 file
 for the L1 cache controller caught my eye. While processing a message
 received through the response network, the transaction buffer entry for
 the address is edited.
 tbe.ExternalResponse := true;
 
 Should this happen where it is happening currently? I think this change
 should appear in some action.
 
 --
 Nilay
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Updating MOESI CMP Directory protocol as per the new interface

2011-01-13 Thread Beckmann, Brad
Hi Nilay,

Yes, please add the OOD token.  I believe that will come in handy when 
developing new protocols.  Don’t worry about separating out that 
RequestorMachine change.  It seems like just a few extra lines.  Also I believe 
the MOESI_CMP_Directory protocol did work correctly before your change, right?  
If so, the RequestorMachine lines are related to the rest of the patch.

Brad


From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
Sent: Thursday, January 13, 2011 8:57 AM
To: Nilay Vaish; Default; Beckmann, Brad
Subject: Re: Review Request: Updating MOESI CMP Directory protocol as per the 
new interface

This is an automatically generated e-mail. To reply, visit: 
http://reviews.m5sim.org/r/359/



On January 13th, 2011, 8:48 a.m., Brad Beckmann wrote:
src/mem/protocol/MOESI_CMP_directory-L1cache.smhttp://reviews.m5sim.org/r/359/diff/8/?file=9537#file9537line159
 (Diff revision 8)


    if (L1DcacheMemory.isTagPresent(addr)) {
    ...
    return L1Icache_entry;

So the assumption here is the L1IcacheMemory.lookup() call either returns the 
L1I cache entry or NULL/OOD, correct?  Does SLICC also support explicitly 
passing back OOD?

Currently, SLICC does not have support for Out Of Domain (OOD) token. But I can 
add that as I had done earlier. I am not sure if we actually need it.


On January 13th, 2011, 8:48 a.m., Brad Beckmann wrote:
src/mem/protocol/MOESI_CMP_directory-L1cache.smhttp://reviews.m5sim.org/r/359/diff/8/?file=9537#file9537line465
 (Diff revision 8)



    out_msg.RequestorMachine := MachineType:L1Cache;

This seems like an unrelated change, correct?  However, it is pretty minor, so 
don't worry about it.

IIRC, this is necessary or else a certain panic state is reached. I think I 
should separately make this change.


- Nilay


On January 12th, 2011, 10:44 p.m., Nilay Vaish wrote:
Review request for Default.
By Nilay Vaish.

Updated 2011-01-12 22:44:50

Description

This is a request for reviewing the proposed changes to the MOESI CMP directory 
cache coherence protocol to make it conform with the new cache memory interface 
and changes to SLICC.


Testing

These changes have been tested using the Ruby random tester. The tester was 
used with -l = 1048576 and -n = 2.


Diffs

 *   src/mem/protocol/MOESI_CMP_directory-L1cache.sm (c6bc8fe81e79)
 *   src/mem/protocol/MOESI_CMP_directory-L2cache.sm (c6bc8fe81e79)
 *   src/mem/protocol/MOESI_CMP_directory-dir.sm (c6bc8fe81e79)
 *   src/mem/protocol/MOESI_CMP_directory-dma.sm (c6bc8fe81e79)

View Diffhttp://reviews.m5sim.org/r/359/diff/


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Checkpoint Tester Problems

2011-01-13 Thread Beckmann, Brad
Well I just realized that I don't have permissions to add new bug reports to 
Flyspray.  My Flyspray user id is beckmabd if anyone would like to grant me 
permissions.  Thanks!

The checkpoint tester is a script located in util/checkpoint_test.py that Ali 
recently pointed me to.  The script is commented well and fully describes what 
it does and how to run it.  When I run a small test using X86_FS, the script 
identifies the following mismatches:

Cmd: util/checkpoint-tester.py -i 2000 -- build/ALPHA_FS_MOESI_hammer/m5.debug 
configs/example/fs.py --script test/halt.sh

Diff output:
--- checkpoint-test/m5out/cpt.1/m5.cpt  Wed Jan 12 14:59:28 2011
+++ checkpoint-test/test.4/cpt.1/m5.cpt Wed Jan 12 15:00:42 2011
@@ -10,20 +10,20 @@
 so_state=2
 locked=false
 _status=1
-instCnt=10
+instCnt=9
 
 [system.cpu.xc.0]
 _status=0
-funcExeInst=16
+funcExeInst=15
 quiesceEndTick=0
 iplLast=0
 iplLastTick=0
 floatRegs.i=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0
 0 0 0 0
-intRegs=549755813888 0 2097152 0 0 0 590336 0 0 0 0 0 0 0 0 0 0 2097208 380 0 
0 0 0 2097189 0 0 0 0
 0 0 0 0 133 0 0 0 0 0 0
+intRegs=549755813888 0 2097152 0 0 0 590336 0 0 0 0 0 0 0 0 0 
18446743523955834880 2097182 380 0 0 
0 0 2097189 0 0 0 0 0 0 0 0 133 0 0 0 0 0 0
 _pc=2097202
-_npc=2097208
-_upc=1
-_nupc=2
+_npc=2097210
+_upc=0
+_nupc=1
 regVal=3758096401 0 0 458752 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4294905840 
1024 2 243392 0 1288 0
 0 0 260 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 
1974748653749254 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1280 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 132609 
0 0 0 0 67108864 0 0 0 0 0 16 8 16 16 16 16 0 0 0 0 0 24 0 0 0 0 0 0 0 0 0 
483328 0 0 0 0 0 0 0 0 0 
0 0 0 483328 0 0 0 0 983295 983295 983295 983295 983295 983295 65535 65535 23 
65535 65535 983295 655
35 45768 43728 45768 45768 45768 45768 45952 0 45952 45952 45952 43976 45952 0 
0 0 0 0 0 0 0 0 0 0 4
276095232 0
 
 [system.cpu.tickEvent]
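The diff stage of such a determinism test can be sketched as follows. m5.cpt checkpoint dumps are INI-style text, so a toy comparison might look like this (illustrative only; the function name is hypothetical, and the real checkpoint-tester.py simply diffs the files):

```python
import configparser

def cpt_mismatches(cpt_a, cpt_b):
    """Compare two m5.cpt-style dumps section by section.

    Returns (section, key, value_a, value_b) tuples for every disagreement,
    like the instCnt=10 vs instCnt=9 mismatch above.  Note configparser
    lowercases keys by default, which is fine for a symmetric comparison.
    """
    a, b = configparser.ConfigParser(), configparser.ConfigParser()
    a.read_string(cpt_a)
    b.read_string(cpt_b)
    diffs = []
    for sec in a.sections():
        if not b.has_section(sec):
            diffs.append((sec, None, "<present>", "<missing>"))
            continue
        for key, val in a.items(sec):
            other = b.get(sec, key, fallback="<missing>")
            if val != other:
                diffs.append((sec, key, val, other))
    return diffs
```

A clean run should produce an empty mismatch list; any nonempty result flags nondeterminism between the original and restored simulations.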


By the way, could we add this test to the regression tester?

Brad



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Gabe Black
 Sent: Wednesday, January 12, 2011 4:42 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Checkpoint Tester Problems
 
 Flyspray would be good. We don't use it like we should, but it's probably the
 most appropriate place. I'm not familiar with the checkpoint tester. How
 does it work (link to the wiki would be fine), and what were the differences?
 
 Gabe
 
 Beckmann, Brad wrote:
 
  Hi All,
 
 
 
  While using the checkpoint tester script, I noticed that at least
  X86_FS with the atomic + classic memory system encounters differences
  in the checkpoint state.  The good news is that none of the patches I
  have out for review add any more checkpoint differences, but we still
  should track down the existing bugs at some point.  Should I use
  flyspray to document the bugs, or would you prefer me to document
  these bugs some other way?
 
 
 
  Thanks,
 
 
 
  Brad
 
 
 
  --
  --
 
  ___
  m5-dev mailing list
  m5-dev@m5sim.org
  http://m5sim.org/mailman/listinfo/m5-dev
 
 
 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-12 Thread Beckmann, Brad
 
 Are you sure you would call the above piece of code as __implicit__ setting
 of cache and tbe entry variables? In this case, the local variable has been
 __explicitly__ passed in the call to the trigger function.
 
 To me 'X is implicit' means that the programmer does not need to write 'X'
 in the protocol file for the compiler. For example, currently trigger function
 implicitly passes the state of the address as a parameter.
 
 Such code is possible; my only concern is that once the variable is set, it
 cannot be used again on the left hand side of the assignment operator.
 
 Entry local_var := getL1ICacheEntry(in_msg.LineAddress)
 /* Do some thing*/
 local_var := getL1DCacheEntry(in_msg.LineAddress)
 
 This SLICC code will not generate correct C++ code, since we assume that a
 pointer variable can only be used in its dereferenced form, except when
 passed on as a parameter in a function call.
 

Yeah, I think we were confusing each other before because implicit was meaning 
different things.  When I said implicitly passes the cache entry, I meant that 
relative to the actions, not the trigger function.  As you mentioned, the input 
state is an implicit parameter to the trigger function, but the address is an 
explicit parameter to the trigger function and an implicit parameter to the 
actions.  You were thinking the former and we were thinking the latter.  Now I 
think we are on the same page.

Actually I was thinking that we only dereference the cache_entry pointer when 
we reference a member of the class.  I haven't thought this all the way 
through, but is that possible?  Then such an assignment would work.

Brad


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


[m5-dev] Checkpoint Tester Problems

2011-01-12 Thread Beckmann, Brad
Hi All,

While using the checkpoint tester script, I noticed that at least X86_FS with 
the atomic + classic memory system encounters differences in the checkpoint 
state.  The good news is that none of the patches I have out for review add any 
more checkpoint differences, but we still should track down the existing bugs 
at some point.  Should I use flyspray to document the bugs, or would you prefer 
me to document these bugs some other way?

Thanks,

Brad

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-11 Thread Beckmann, Brad
Hi Nilay,

Sure, using a local variable to further reduce the calls to getCacheEntry is a 
great idea.  I think that is orthogonal to the suggestion I was making.  I just 
want the ability to directly set the cache_entry and tbe_entry variables in the 
trigger function.  That way the address, cache_entry, and tbe_entry variables 
are dealt with consistently and it avoids adding the separate calls to 
set_cache_entry() and set_tbe () in the inports.

Brad


-Original Message-
From: Nilay Vaish [mailto:ni...@cs.wisc.edu] 
Sent: Friday, January 07, 2011 11:40 AM
To: Beckmann, Brad
Cc: Default
Subject: RE: Review Request: Changing how CacheMemory interfaces with SLICC

Brad, my comments are inline.


On Fri, 7 Jan 2011, Beckmann, Brad wrote:

 Hi Nilay,



 Unfortunately I can't provide you an example of a protocol where 
 getCacheEntry behaves in a different manner, but they do exist.  I 
 reviewed your most recent patch updates and I don't think what we're 
 asking for is much different than what you have on reviewboard right 
 now.  Basically, all we need to do is add back in the capability for 
 the programmer to write their own getCacheEntry function in the .sm file.
 I know that I initially asked you to automatically generate those 
 functions, and I still think that is useful for most protocols, but 
 Lisa made me realize that we need customized getCacheEntry functions as well.
 Also we may want to change the name of generated getCacheEntry 
 function to getExclusiveCacheEntry so that one realizes the exclusive 
 assumption made by the function.



 Other than that, the only other change I suggest is to allow the trigger 
 function to directly set the implicit cache_entry and tbe_entry 
variables.  Below is an example of what I'm envisioning:


[Nilay] If we do things in this way, then any in_port, in which cache / tb 
entries are accessed before the trigger function, would still make calls 
to isCacheTagPresent().



 Currently in MOESI_CMP_directory-L1cache.sm:



 in_port(useTimerTable_in, Address, useTimerTable) {

if (useTimerTable_in.isReady()) {

set_cache_entry(getCacheEntry(useTimerTable.readyAddress()));

set_tbe(TBEs[useTimerTable.readyAddress()]);

trigger(Event:Use_Timeout, useTimerTable.readyAddress());

}

 }



 Replace that with the following:



 in_port(useTimerTable_in, Address, useTimerTable) {

if (useTimerTable_in.isReady()) {

trigger(Event:Use_Timeout, useTimerTable.readyAddress(),

getExclusiveCacheEntry(useTimerTable.readyAddress()),

   TBEs[useTimerTable.readyAddress()]);

}

 }


[Nilay] Instead of passing cache and tb entries as arguments, we can 
create local variables in the trigger function using the address argument.



 Please let me know if you have any questions.



 Thanks...you're almost done.  :)



 Brad






Thanks
Nilay




Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-11 Thread Beckmann, Brad
  Sure, using a local variable to further reduce the calls to
  getCacheEntry is a great idea.  I think that is orthogonal to the
  suggestion I was making.  I just want the ability to directly set the
  cache_entry and tbe_entry variables in the trigger function.  That way
  the address, cache_entry, and tbe_entry variables are dealt with
  consistently and it avoids adding the separate calls to
 set_cache_entry() and set_tbe() in the inports.
 
 Firstly, we have to set cache and transaction buffer entry variables whenever
 we do allocation or deallocation of entries. This means these calls cannot be
 completely avoided. Secondly, while processing events from the mandatory
 queue (as it is called in the current implementations), if these variables are
 not set, we will have to revert to the earlier approach. This would double the
 number of times cache entry lookups are performed as the trigger function
 will perform the lookup again. This would also mean that both the
 approaches for looking up cache entry in the cache will have to exist
 simultaneously.
 

Absolutely, we still need the ability to allocate or deallocate entries within 
actions.  I'm not advocating to completely eliminate the set/unset cache and 
tbe entry functions.  I just want to avoid including those calls in the 
inports.  I'm confused why the mandatory queue is different than other queues.  
They all trigger events in the same way.  Maybe I should point out that I'm 
assuming that getCacheEntry can return a NULL pointer and thus that can be 
passed into the trigger call when no cache or tbe entry exists.


 Another concern is in implementation of getCacheEntry(). If this function has
 to return a pointer to a cache entry, we would have to provide support for
 local variables which internally SLICC would assume to be pointer variables.
 

Within SLICC, understanding that certain variables are actually pointers is a 
bit of a nuisance, but there already exist examples where we make that 
distinction.  For instance, look at the "if para.pointer" conditionals in 
StateMachine.py.  We just have to treat cache and tbe entries in the same 
fashion.


 In my opinion, we should maintain one method for looking up cache entries.
 My own experience informs me that it is not difficult to incorporate calls to
 set/unset_cache_entry () in already existing protocol implementations.
 For implementing new protocols, I think the greater obstacle will be in
 implementing the protocol correctly and not in using entry variables
 correctly. If we document this change lucidly, there is no reason to believe a
 SLICC programmer will be unduly burdened by this change.
 
 Assuming that this change does introduce some complexity in programming
 with SLICC, does that complexity outweigh the performance improvements?
 

My position is we can leverage SLICC as an intermediate language and achieve 
the performance benefits of your change without significantly impacting the 
programmability.  I agree that we need the set/unset_cache_entry calls in the 
allocate and deallocate actions.  I see no problem with that.  I just want to 
treat these new implicit cache and tbe entry variables like the existing 
implicit variable address.  Therefore I want to pass them into the trigger 
operation like the address variable.  I also want just one method for looking 
up cache entries.  I believe the only difference is that I would like to set 
the cache and tbe entries in the trigger function, as well as allowing them to 
be set in the actions.  

I hope that clarifies at least what I'm envisioning.  I appreciate your 
feedback on this and I want to reiterate that I think your change is really 
close to being done.  If you still feel like I'm missing something, I would be 
happy to chat with you over-the-phone.

Brad





Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-11 Thread Beckmann, Brad
Hi Nilay,

Overall, I believe we are in more agreement with each other than maybe you 
think.  I'm glad you included pseudo code in your latest email.  That is a 
great idea.  I think part of our problem is we are comprehending our textual 
descriptions in different ways.

Below are my  responses:
 
 
  Absolutely, we still need the ability to allocate or deallocate
  entries within actions.  I'm not advocating to completely eliminate
  the set/unset cache and tbe entry functions.  I just want to avoid
  including those calls in the inports.  I'm confused why the mandatory
  queue is different than other queues.  They all trigger events in the same
 way.
 
  if (L1IcacheMemory.isTagPresent(in_msg.LineAddress)) {
    // The tag matches for the L1, so the L1 asks the L2 for it.
    trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress);
  }
 
 Brad, mandatory queue is just an example where an inport may perform tag
 lookup before cache and transaction buffer entries has been set. Above is an
 excerpt from the file MOESI_CMP_directory-L1cache.sm. Before the
 trigger() is called, isTagPresent() is called. This means tag look up is being
 performed before cache or transaction buffer entries have been set.
 Suppose the tag was present in L1Icache, then in the trigger() call, we will
 again perform lookup.
 
 Similarly, there is an inport in the Hammer's protocol implementation where
 getCacheEntry() is called before a call to trigger(). Now, why should we use
 getCacheEntry() in the inport and cache entry in the action?
 

The reason is, as you pointed out, we ideally want to call getCacheEntry once.  
I believe your suggestion to use local variables in the input ports gets us 
there.  Below is what I'm envisioning for the MOESI_hammer mandatory queue 
in_port logic (at least the IFETCH half of the logic):

ENTRY getL1ICacheEntry(Address addr) {
   assert(is_valid(L1DcacheMemory.lookup(addr)) == FALSE);
   assert(is_valid(L2cacheMemory.lookup(addr)) == FALSE);
   return L1IcacheMemory.lookup(addr);
}

ENTRY getL1DCacheEntry(Address addr) {
   assert(is_valid(L1IcacheMemory.lookup(addr)) == FALSE);
   assert(is_valid(L2cacheMemory.lookup(addr)) == FALSE);
   return L1DcacheMemory.lookup(addr);
}

ENTRY getL2CacheEntry(Address addr) {
   assert(is_valid(L1IcacheMemory.lookup(addr)) == FALSE);
   assert(is_valid(L1DcacheMemory.lookup(addr)) == FALSE);
   return L2cacheMemory.lookup(addr);
}

  in_port(mandatoryQueue_in, CacheMsg, mandatoryQueue, desc=..., rank=0) {
    if (mandatoryQueue_in.isReady()) {
      peek(mandatoryQueue_in, CacheMsg, block_on=LineAddress) {

        // Set the local entry variables
        ENTRY L1I_cache_entry = getL1ICacheEntry(in_msg.LineAddress);
        ENTRY L1D_cache_entry = getL1DCacheEntry(in_msg.LineAddress);
        TBE_Entry tbe_entry = getTBE(in_msg.LineAddress);

        // Check for data accesses to blocks in the I-cache and ifetches to
        // blocks in the D-cache

        if (in_msg.Type == CacheRequestType:IFETCH) {
          // ** INSTRUCTION ACCESS ***

          // Check to see if it is in the OTHER L1
          if (is_valid(L1D_cache_entry)) {
            // The block is in the wrong L1; try to write it to the L2
            if (L2cacheMemory.cacheAvail(in_msg.LineAddress)) {
              trigger(Event:L1_to_L2, in_msg.LineAddress, L1D_cache_entry,
                      tbe_entry);
            } else {
              replace_addr = L2cacheMemory.cacheProbe(in_msg.LineAddress);
              replace_cache_entry = getL2CacheEntry(replace_addr);
              replace_tbe_entry = getTBE(replace_addr);
              trigger(Event:L2_Replacement, replace_addr, replace_cache_entry,
                      replace_tbe_entry);
            }
          }

          if (is_valid(L1I_cache_entry)) {
            // The tag matches for the L1, so the L1 fetches the line.  We
            // know it can't be in the L2 due to exclusion.
            trigger(mandatory_request_type_to_event(in_msg.Type),
                    in_msg.LineAddress, L1I_cache_entry, tbe_entry);
          } else {
            if (L1IcacheMemory.cacheAvail(in_msg.LineAddress)) {
              // L1 doesn't have the line, but we have space for it in the L1
              ENTRY L2_cache_entry = getL2CacheEntry(in_msg.LineAddress);
              if (is_valid(L2_cache_entry)) {
                // L2 has it (maybe not with the right permissions)
                trigger(Event:Trigger_L2_to_L1I, in_msg.LineAddress,
                        L2_cache_entry, tbe_entry);
              } else {
                // We have room and the L2 doesn't have it, so the L1 fetches
                // the line
                trigger(mandatory_request_type_to_event(in_msg.Type),
                        in_msg.LineAddress, L1I_cache_entry, tbe_entry);
                // you could also say here:
                // trigger(mandatory_request_type_to_event(in_msg.Type),
                //         in_msg.LineAddress, ODD, ODD);
              }
            } else {
              // No room in the L1, so we need to make room
              if (L2cacheMemory.cacheAvail(L1IcacheMemory.cacheProbe(in_msg.LineAddress))) {
  

Re: [m5-dev] Review Request: Ruby: Update the Ruby request type names for LL/SC

2011-01-10 Thread Beckmann, Brad
Oops...I forgot to include the -o option when updating it.  I just uploaded a 
new patch...try it again.

Brad


-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Nilay
Sent: Monday, January 10, 2011 9:01 AM
To: M5 Developer List
Subject: Re: [m5-dev] Review Request: Ruby: Update the Ruby request type names 
for LL/SC

Brad, this patch also did not apply cleanly. I think the patches that you are 
trying to upload do not follow git's style.

On Mon, January 10, 2011 10:52 am, Brad Beckmann wrote:

 ---
 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/391/
 ---

 (Updated 2011-01-10 08:52:22.568922)


 Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, 
 and Nathan Binkert.


 Summary
 ---

 Ruby: Update the Ruby request type names for LL/SC


 Diffs (updated)
 -

   src/mem/ruby/libruby.hh b5d03e87db4e
   src/mem/ruby/libruby.cc b5d03e87db4e
   src/mem/ruby/recorder/TraceRecord.cc b5d03e87db4e
   src/mem/ruby/system/DMASequencer.cc b5d03e87db4e
   src/mem/ruby/system/RubyPort.cc b5d03e87db4e
   src/mem/ruby/system/Sequencer.cc b5d03e87db4e

 Diff: http://reviews.m5sim.org/r/391/diff


 Testing
 ---


 Thanks,

 Brad




--
Nilay





Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-07 Thread Beckmann, Brad
Hi Nilay,



Unfortunately I can't provide you an example of a protocol where getCacheEntry 
behaves in a different manner, but they do exist.  I reviewed your most recent 
patch updates and I don't think what we're asking for is much different than 
what you have on reviewboard right now.  Basically, all we need to do is add 
back in the capability for the programmer to write their own getCacheEntry 
function in the .sm file.  I know that I initially asked you to automatically 
generate those functions, and I still think that is useful for most protocols, 
but Lisa made me realize that we need customized getCacheEntry functions as 
well.  Also we may want to change the name of generated getCacheEntry function 
to getExclusiveCacheEntry so that one realizes the exclusive assumption made by 
the function.



Other than that, the only other change I suggest is to allow the trigger 
function to directly set the implicit cache_entry and tbe_entry variables.  
Below is an example of what I'm envisioning:



Currently in MOESI_CMP_directory-L1cache.sm:



in_port(useTimerTable_in, Address, useTimerTable) {

if (useTimerTable_in.isReady()) {

set_cache_entry(getCacheEntry(useTimerTable.readyAddress()));

set_tbe(TBEs[useTimerTable.readyAddress()]);

trigger(Event:Use_Timeout, useTimerTable.readyAddress());

}

}



Replace that with the following:



in_port(useTimerTable_in, Address, useTimerTable) {

if (useTimerTable_in.isReady()) {

trigger(Event:Use_Timeout, useTimerTable.readyAddress(),

getExclusiveCacheEntry(useTimerTable.readyAddress()),

   TBEs[useTimerTable.readyAddress()]);

}

}



Please let me know if you have any questions.



Thanks...you're almost done.  :)



Brad





-Original Message-
From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
Sent: Thursday, January 06, 2011 6:32 AM
To: Beckmann, Brad
Cc: Default
Subject: RE: Review Request: Changing how CacheMemory interfaces with SLICC



Can you give me an example of a protocol where getCacheEntry() behaves in a 
different manner?



--

Nilay



On Wed, 5 Jan 2011, Beckmann, Brad wrote:



 Hi Nilay,



 Lisa Hsu (another member of the lab here at AMD) and I were discussing

 these changes a bit more and there was one particular idea that came

 out of our conversation that I wanted to relay to you.  Basically, we

 were thinking about how these changes will impact the flexibility of

 SLICC and we concluded that it is important to allow one to craft

 custom getCacheEntry functions for each protocol.  I know initially I

 was hoping to generate these functions, but I now don't think that

 is possible without restricting what protocols can be supported by SLICC.

 Instead we can use these customized getCacheEntry functions to pass

 the cache entry to the actions via the trigger function.  For those

 controllers that manage multiple cache memories, it is up to the

 programmer to understand what the cache entry pointer points to.  That

 should eliminate the need to have multiple *cacheMemory_entry

 variables in the .sm files.  Instead there is just the cache_entry

 variable that is set either by the trigger function call or set_cache_entry.



 Does that make sense to you?



 Brad






Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh

2011-01-05 Thread Beckmann, Brad
1.   Below is a snip of a protocol trace that I recently used.  I think it 
is important for us to maintain that there is no DPRINTF information prepended 
to each line.  The initial motivation for the protocol trace was that tracing 
protocol transitions using standard debug printing was too verbose.  These 
traces can be 100s of MBs, if not GBs, in size, so reducing the information 
printed on each line is important.  Nilay, could you send a snip of the trace 
with the patch applied?

2233850   3    L1Cache                Load     I>IS     [0x409ec0, line 0x409ec0]
2233850   3    L1Cache      L2_Replacement     M>MI     [0x40cfc0, line 0x40cfc0]
2233866   3    L1Cache       Writeback_Ack     MI>I     [0x10bd40, line 0x10bd40]
2233866   3    L1Cache       Writeback_Ack     MI>I     [0x40cfc0, line 0x40cfc0]
2234458   3        Seq                Done              [0x4033c3, line 0x4033c0] 3380 cycles
2234458   3    L1Cache      Exclusive_Data  IM>MM_W     [0x4033c0, line 0x4033c0] 0Directory-0
2234458   3        Seq               Begin             [0x40f883, line 0x40f880] ST
2234459   3    L1Cache All_acks_no_sharers  MM_W>MM     [0x4033c0, line 0x4033c0]
2234508   3    L1Cache      Exclusive_Data   IS>M_W     [0x409ec0, line 0x409ec0] 0Directory-0
2234509   3    L1Cache All_acks_no_sharers    M_W>M     [0x409ec0, line 0x409ec0]
2234510   3    L1Cache            L1_to_L2              [0x4033c0, line 0x4033c0]
2234510   3    L1Cache                Load     I>IS     [0x407ec0, line 0x407ec0]
2234510   3    L1Cache      L2_Replacement     M>MI     [0x40b4c0, line 0x40b4c0]
2234510   3    L1Cache            L1_to_L2      M>M     [0x409ec0, line 0x409ec0]
2234510   3    L1Cache                Load     I>IS     [0x100c40, line 0x100c40]
2234510   3    L1Cache      L2_Replacement    MM>MI     [0x4033c0, line 0x4033c0]



2.   Just for my own knowledge... Nate, you mentioned that handling the 
SIGABRT signal is the right way to make this feature work for all of M5.  Why 
is that?  Is it just a preference not to use macros that override the 
meaning of assert, or is it something more fundamental?

Thanks,

Brad

From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
Sent: Tuesday, January 04, 2011 7:24 PM
To: Steve Reinhardt; Ali Saidi; Gabe Black; Nathan Binkert
Cc: Nilay Vaish; Default; Beckmann, Brad
Subject: Re: Review Request: ruby: get rid of ruby's Debug.hh

This is an automatically generated e-mail. To reply, visit: 
http://reviews.m5sim.org/r/367/



On January 4th, 2011, 4:31 p.m., Brad Beckmann wrote:

Hi Nate,



I have a couple questions:



1. Have you looked at the protocol trace output after your change?  Does it 
look exactly like it did before?  It seems that the output should be the same 
based on my brief inspection of your patch, but I would like to be sure about 
that.  It may not be obvious, but there is a specific rational behind the 
format of the protocol trace and I want to make sure that stays the same.



2. With your patch applied, what happens if one hits an assert when running 
interactively?  Previously, the process would abort allowing one to attach gdb 
and examine what is going on.  I liked that feature and it would be great if we 
could maintain it.  Could we port that feature to all of M5?

On January 4th, 2011, 6:05 p.m., Nathan Binkert wrote:

1) I have not, because I don't know how, but I tried hard to make it exactly 
the same.  Can you help me out?  It won't look identical because DPRINTF 
prepends some stuff (curTick and object name)



2) we don't have a mechanism to have the process stall until GDB is attached, 
but given that this worked in Ruby only, I'd agree that this should be 
something that we do globally in M5.  The right way to do this would be to 
handle SIGABRT and stall in the abort handler (I think that should work).  Can 
we work on this patch and do that as a separate one?

Brad, do you have some protocol trace with you? I have seen the trace that gets 
generated with the current trace facility using Ruby trace flag. It prints all 
the events for all the cache controllers and network routers. If you prefer, I 
can send you an example trace. Or you can generate one by running m5.opt with 
trace file and trace flag options supplied.



./build/ALPHA_SE_MESI_CMP_directory/m5.opt  --trace-file=MESI.trace  
--trace-flags=Ruby ./configs/example/ruby_random_test.py -l 1000


- Nilay


On January 4th, 2011, 3:02 p.m., Nathan Binkert wrote:
Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan 
Binkert.
By Nathan Binkert.

Updated 2011-01-04 15:02:38

Description

ruby: get rid of ruby's Debug.hh



Get rid of the Debug class

Get rid of ASSERT and use assert

Use DPRINTF for ProtocolTrace


Testing

This compiles and passes all of the quick regressions, but it would be nice for 
a Ruby developer to take a look and see if I got rid of any useful 
functionality.


Diffs

 *   configs/ruby/Ruby.py (7338bc628489)
 *   src/mem/SConscript

Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh

2011-01-05 Thread Beckmann, Brad
Is it possible to fix the width of the information prepended by DPRINTF?  It 
would be great if we could maintain the current fixed-width format.

Brad


-Original Message-
From: bink...@gmail.com [mailto:bink...@gmail.com] On Behalf Of nathan binkert
Sent: Wednesday, January 05, 2011 10:36 AM
To: M5 Developer List
Cc: Beckmann, Brad
Subject: Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh

Looks like we should just remove the first, second, and third columns that are 
spit out since they're covered almost exactly by the implicit columns added by 
DPRINTF.  Right?

  Nate

 This is how the protocol trace would look.  I actually did not know such a 
 thing existed.  I was relying on DPRINTF statements for checking the 
 events that occurred.  This is certainly easier to read 
 and much more compact.

   1395: system.l1_cntrl3:    1395   3    L1Cache      L1_Replacement    IM>IM     [0x15c0, line 0x15c0]
   1395: system.l1_cntrl2:    1395   2    L1Cache      L1_Replacement    IM>IM     [0x3ac0, line 0x3ac0]
   1395: system.l1_cntrl2:    1395   2    L1Cache      L1_Replacement    IM>IM     [0x3ac0, line 0x3ac0]
   1395: system.l1_cntrl2:    1395   2    L1Cache      L1_Replacement    IM>IM     [0x3ac0, line 0x3ac0]
   1396: system.l1_cntrl1:    1396   1    L1Cache      L1_Replacement    IM>IM     [0x2ac0, line 0x2ac0]
   1397: system.ruby.cpu_ruby_ports2:    1397   2        Seq      Done            [0x3ae8, line 0x3ac0] 1386 cycles
   1397: system.l1_cntrl2:    1397   2    L1Cache       Data_all_Acks     IM>M     [0x3ac0, line 0x3ac0]
   1397: system.l1_cntrl3:    1397   3    L1Cache      L1_Replacement    IM>IM     [0x15c0, line 0x15c0]
   1397: system.l2_cntrl0:    1397   0    L2Cache  L2_Replacement_clean  MT_MB>MT_MB  [0x3ac0, line 0x3ac0]
   1400: system.l2_cntrl0:    1400   0    L2Cache             L1_GETX    MT_MB>MT_MB  [0x400, line 0x400]
   1401: system.l2_cntrl0:    1401   0    L2Cache  L2_Replacement_clean  MT_MB>MT_MB  [0x3ac0, line 0x3ac0]
   1402: system.l1_cntrl0:    1402   0    L1Cache      L1_Replacement    IM>IM     [0x4dc0, line 0x4dc0]
   1402: system.l1_cntrl2:    1402   2    L1Cache      L1_Replacement    IM>IM     [0xdc0, line 0xdc0]
   1402: system.l1_cntrl2:    1402   2    L1Cache      L1_Replacement    IM>IM     [0xdc0, line 0xdc0]



 On Wed, January 5, 2011 10:26 am, Beckmann, Brad wrote:
 1.       Below is a snip of a protocol trace that I recently used.  I 
 think it is important for us to maintain that there is no DPRINTF 
 information prepended to each line.  The initial motivation for the 
 protocol trace was that tracing protocol transitions using standard 
 debug printing was too verbose.  These traces can be 100s of MBs, if not 
 GBs, in size, so reducing the information printed on each line is 
 important.  Nilay, could you send a snip of the trace with the patch applied?

 2233850   3    L1Cache                Load     I>IS     [0x409ec0, line 0x409ec0]
 2233850   3    L1Cache      L2_Replacement     M>MI     [0x40cfc0, line 0x40cfc0]
 2233866   3    L1Cache       Writeback_Ack     MI>I     [0x10bd40, line 0x10bd40]
 2233866   3    L1Cache       Writeback_Ack     MI>I     [0x40cfc0, line 0x40cfc0]
 2234458   3        Seq                Done              [0x4033c3, line 0x4033c0] 3380 cycles
 2234458   3    L1Cache      Exclusive_Data  IM>MM_W     [0x4033c0, line 0x4033c0] 0Directory-0
 2234458   3        Seq               Begin             [0x40f883, line 0x40f880] ST
 2234459   3    L1Cache All_acks_no_sharers  MM_W>MM     [0x4033c0, line 0x4033c0]
 2234508   3    L1Cache      Exclusive_Data   IS>M_W     [0x409ec0, line 0x409ec0] 0Directory-0
 2234509   3    L1Cache All_acks_no_sharers    M_W>M     [0x409ec0, line 0x409ec0]
 2234510   3    L1Cache            L1_to_L2              [0x4033c0, line 0x4033c0]
 2234510   3    L1Cache                Load     I>IS     [0x407ec0, line 0x407ec0]
 2234510   3    L1Cache      L2_Replacement     M>MI     [0x40b4c0, line 0x40b4c0]
 2234510   3    L1Cache            L1_to_L2      M>M     [0x409ec0, line 0x409ec0]
 2234510   3    L1Cache                Load     I>IS     [0x100c40, line 0x100c40]
 2234510   3    L1Cache      L2_Replacement    MM>MI     [0x4033c0, line 0x4033c0]



 2.       Just for my own knowledge... Nate, you mentioned that 
 handling the SIGABRT signal is the right way to make this feature 
 work for all of M5.  Why is that?  Is it just a preference not to 
 use macros that override the meaning of assert, or is it something more 
 fundamental?

 Thanks,

 Brad

 From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
 Sent: Tuesday, January 04, 2011 7:24 PM
 To: Steve Reinhardt; Ali Saidi; Gabe Black; Nathan Binkert
 Cc: Nilay Vaish; Default; Beckmann, Brad
 Subject: Re: Review Request: ruby: get rid of ruby's Debug.hh

 This is an automatically generated e-mail. To reply, visit:
 http://reviews.m5sim.org/r/367/



 On January 4th, 2011, 4:31 p.m., Brad Beckmann wrote:

 Hi Nate,



 I have a couple questions

Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh

2011-01-05 Thread Beckmann, Brad
So if we explicitly handled the SIGABRT signal, we would only want to do that 
if we are running interactively, correct?  If so, then we would still have some 
sort of conditional similar to, if not the same as, the current conditional in 
the assert macro, if (isatty(STDIN_FILENO)).  If my understanding is correct, 
then we would still have multiple behaviors for assert: one when running 
interactively and another when running in batch mode.

Am I missing something?  I just want to make sure I understand why we don't 
want to just move the current Ruby ASSERT macro into src/base/debug.hh (or some 
other file in src/base).

Thanks,

Brad


From: bink...@gmail.com [mailto:bink...@gmail.com] On Behalf Of nathan binkert
Sent: Wednesday, January 05, 2011 10:40 AM
To: Beckmann, Brad
Cc: Nilay Vaish; Steve Reinhardt; Ali Saidi; Gabe Black; Default
Subject: Re: Review Request: ruby: get rid of ruby's Debug.hh

2.   Just for my own knowledge... Nate, you mentioned that handling the 
SIGABRT signal is the right way to make this feature work for all of M5.  Why 
is that?  Is it just the preference not to use macros that overwrite the 
meaning of assert, or is it something more fundamental?
Not fundamental.  Mostly because we don't want multiple meanings of assert.  It 
seems that if we could get this to work properly that it would be easiest as 
well.

  Nate


Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh

2011-01-05 Thread Beckmann, Brad
Yeah, that seems rather tedious.  Let's just use DPRINTFN and maintain the 
current format.  As long as the protocol trace format looks the same as before, 
I'm happy with the change.

Brad


-Original Message-
From: bink...@gmail.com [mailto:bink...@gmail.com] On Behalf Of nathan binkert
Sent: Wednesday, January 05, 2011 12:30 PM
To: Beckmann, Brad
Cc: M5 Developer List
Subject: Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh

 Is it possible to fix the width of the information prepended by DPRINTF?  It 
 would be great if we could maintain the current fixed-width format.

That might be hard (and may argue for DPRINTFN).  In practice, when I want 
that, I usually just ensure that my object names end up not varying in length.

e.g. system0.cpu0.l1_foo0.  If I have more than 10 things, I make the name 
cpu00 or something like that.

  Nate




Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-05 Thread Beckmann, Brad
Hi Nilay,

Lisa Hsu (another member of the lab here at AMD) and I were discussing these 
changes a bit more and there was one particular idea that came out of our 
conversation that I wanted to relay to you.  Basically, we were thinking about 
how these changes will impact the flexibility of SLICC and we concluded that it 
is important to allow one to craft custom getCacheEntry functions for each 
protocol.  I know initially I was hoping to generate these functions, but I now 
don't think that is possible without restricting what protocols can be supported 
by SLICC.  Instead we can use these customized getCacheEntry functions to pass 
the cache entry to the actions via the trigger function.  For those controllers 
that manage multiple cache memories, it is up to the programmer to understand 
what the cache entry pointer points to.  That should eliminate the need to have 
multiple *cacheMemory_entry  variables in the .sm files.  Instead there is just 
the cache_entry variable that is set either by the trigger function call or 
set_cache_entry.

Does that make sense to you?

Brad


From: Nilay Vaish [mailto:ni...@cs.wisc.edu]
Sent: Tuesday, January 04, 2011 9:43 AM
To: Nilay Vaish; Default; Beckmann, Brad
Subject: Re: Review Request: Changing how CacheMemory interfaces with SLICC

This is an automatically generated e-mail. To reply, visit: 
http://reviews.m5sim.org/r/358/



On January 3rd, 2011, 3:31 p.m., Brad Beckmann wrote:

Hi Nilay,



First, I must say this is an impressive amount of work.  You definitely got a 
lot done over holiday break. :)



Overall, this set of patches is definitely close, but I want to see if we can 
take them a step forward.  Also I have a few suggestions that may make things 
easier.  Finally, I have a bunch of minor questions/suggestions on individual 
lines, but I’ll hold off on those until you can respond to my higher-level 
questions.



The main thing I would like to see improved is not having to differentiate 
between "entry" and "entry_ptr" in the .sm files.  Am I correct that the only 
functions in the .sm files that are passed an "entry_ptr" are "is_valid_ptr", 
"getCacheEntry", and "set_cache_entry"?  If so, it seems that all three 
functions are generated with unique python code, either in an AST file or 
StateMachine.py.  Therefore, could we just pass these functions "entry" and 
rely on the underlying python code to generate the correct references?  This 
would make things more readable, "is_valid_ptr()" becomes "is_valid", and it 
doesn't require the slicc programmer to understand which functions take an 
entry pointer versus the entry itself.  If we can't make such a change, I worry 
about how much extra complexity this change pushes on the slicc programmer.



Also another suggestion to make things more readable, please replace the name 
L1IcacheMemory_entry with L1I_entry.  Do the same for L1D_entry and L2_entry.  
That will shorten many of your lines.



So am I correct that hammer's simultaneous usage of valid L1 and L2 cache 
entries in certain transitions is the only reason that, within all actions, the 
getCacheEntry calls take multiple cache entries?  If so, I think it would be 
fairly trivial to use a tbe entry as an intermediary between the L1 and L2 for 
those particular hammer transitions.  That way only one cache entry is valid at 
any particular time, and we can simply use the variable cache_entry in the 
actions.  That should clean things up a lot.



By the way, once you check in these patches, the MESI_CMP_directory protocol 
will be deprecated, correct?  If so, make sure you include a patch that removes 
it from the regression tester.



Brad





Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-04 Thread Beckmann, Brad
Hi Nilay,

My responses are below:

 The main thing I would like to see improved is not having to differentiate
 between "entry" and "entry_ptr" in the .sm files.  Am I correct
 that the only functions in the .sm files that are passed an
 "entry_ptr" are "is_valid_ptr", "getCacheEntry", and
 "set_cache_entry"?  If so, it seems that all three functions are
 generated with unique python code, either in an AST file or
 StateMachine.py.  Therefore, could we just pass these functions
 "entry" and rely on the underlying python code to generate the correct
 references?  This would make things more readable, "is_valid_ptr()"
 becomes "is_valid", and it doesn't require the slicc programmer to
 understand which functions take an entry pointer versus the entry itself. 
 If we can't make such a change, I worry about how much extra complexity
 this change pushes on the slicc programmer.

There are functions that are passed cache entry and transaction buffer entry as 
arguments. Currently, I assume that these arguments are passed using pointers.

[BB] So does that mean that the cache entry is always passed in as a pointer?  
If so, can one just use cache_entry for all function calls and remove any use 
of cache_entry_ptr in the .sm files?  That is essentially what I would like 
to see.  

 
 Also another suggestion to make things more readable, please replace the
 name L1IcacheMemory_entry with L1I_entry.  Do the same for L1D_entry and
 L2_entry.  That will shorten many of your lines.

The names of the cache entry variables are currently tied with the names of the 
cache memory variables belonging to the machine. If the name of the cache 
memory variable is A, then the corresponding cache entry variable is named 
A_entry.

[BB] Ah, I see.  Ok then let's just keep them the way they are for now.  We can 
deal with shortening the names later.

 So am I correct that hammer's simultaneous usage of valid L1 and L2
 cache entries in certain transitions is the only reason that within all
 actions, the getCacheEntry calls take multiple cache entries?  If so, I
 think it would be fairly trivial to use a tbe entry as an intermediary
 between the L1 and L2 for those particular hammer transitions.  That way
 only one cache entry is valid at any particular time, and we can simply
 use the variable cache_entry in the actions.  That should clean things up
 a lot.

Oops! Should have thought of that before doing all those changes. But can we 
assume that we would always have only one valid cache entry pointer at any 
given time? If that's true, I would probably revert to previous version of the 
patch. This should also resolve the naming issue.

[BB] I wouldn't have expected you to realize that.  It is one of those things 
that isn't completely obvious without spending a lot of time developing 
protocols.  Yes, I think it is easiest for you to just revert to the previous 
version of the patch and just modify the hammer protocol to use a tbe entry as 
an intermediary.  We've always had an unofficial rule that a controller can 
only manage multiple caches if those caches are exclusive with respect to each 
other.  For the most part, that rule has been followed by all the protocols I'm 
familiar with.  I think your change just makes that an official policy.
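The "TBE as intermediary" move described above can be shown with a toy Python sketch, using plain dicts as stand-ins for the L2, the L1, and the TBE table. None of this is gem5 code; it only illustrates the invariant that at most one cache entry is valid at any instant during the L2-to-L1 transfer.

```python
# Toy model: move a block from L2 to L1 via a TBE so that the two caches
# never simultaneously hold a valid entry for the same address.

def move_l2_to_l1(l2, l1, tbes, address):
    # 1. Stash the block in a TBE and invalidate the L2 entry.
    tbes[address] = l2.pop(address)
    # At this point neither cache holds a valid entry for the address.
    assert address not in l2 and address not in l1
    # 2. Allocate the L1 entry from the TBE and free the TBE.
    l1[address] = tbes.pop(address)
```

With this discipline, every action only ever sees a single valid cache entry, so the actions can use one `cache_entry` variable instead of per-cache entry arguments.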

 By the way, once you check in these patches, the MESI_CMP_directory
 protocol will be deprecated, correct?  If so, make sure you include a
 patch that removes it from the regression tester.

I have a patch for the protocol, but I need to discuss it. Do you think it is 
possible that a protocol is not actually in a deadlock, but the random tester 
reports one because the underlying memory system is taking too much time? The 
patch works for 1, 2, and 4 processors for 10,000,000 loads. I have tested 
these processor configurations with 40 different seed values. But for 8 
processors, the random tester outputs something like this --

panic: Possible Deadlock detected. Aborting!
version: 6 request.paddr: 12779 m_writeRequestTable: 15 current time: 369500011 
issue_time: 368993771 difference: 506240
 @ cycle 369500011
[wakeup:build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc, line 
123]
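For reference, a panic like the one above comes from a periodic sweep over the outstanding request tables, comparing each request's age against a threshold. A minimal Python sketch of that kind of check follows; the threshold value, function name, and table layout are illustrative assumptions, not the real Sequencer::wakeup() code.

```python
# Sketch of a Sequencer-style deadlock sweep: any outstanding request
# older than the threshold triggers the "Possible Deadlock" panic.
DEADLOCK_THRESHOLD = 500000  # cycles; illustrative value

def find_stuck_requests(issue_times, current_time):
    """Return issue times of requests that have waited past the threshold."""
    return [t for t in issue_times
            if current_time - t >= DEADLOCK_THRESHOLD]
```

Plugging in the numbers from the panic above: 369500011 - 368993771 = 506240 cycles, which would exceed a 500000-cycle threshold and flag the request as stuck.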

[BB] Yes, the current version of MESI_CMP_directory is broken in many places.  
Arka just told me that he recently fixed many of those problems.  I suggest 
getting his fixes and working from there.

Brad
___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC

2011-01-04 Thread Beckmann, Brad
Hi Nilay,

At one point in time, the combination of several letters at the beginning of 
the action name corresponded to the short hand name for the action.  The short 
hand name is the letter or letter combination that appears in the HTML tables.  
SLICC may have once enforced that the combination of letters matched the HTML 
short hand name, but I don't believe it does right now.  Therefore the letters 
are just a guide to match actions with their associated short hand name.  And 
yes, you can use any combination.

Brad


-Original Message-
From: Nilay Vaish [mailto:ni...@cs.wisc.edu] 
Sent: Tuesday, January 04, 2011 12:03 PM
To: Beckmann, Brad
Cc: Default
Subject: RE: Review Request: Changing how CacheMemory interfaces with SLICC

Brad

Is there a reason why each action name follows the pattern <combination of 
several letters>_<action performed by the action>? The letters used are not 
abbreviations of the action performed. Can we use any combination?

Thanks
Nilay

On Tue, 4 Jan 2011, Beckmann, Brad wrote:

 Hi Nilay,

 My responses are below:

 The main thing I would like to see improved is not having to 
 differentiate between "entry" and "entry_ptr" in the .sm 
 files.  Am I correct that the only functions in the .sm files that 
 are passed an "entry_ptr" are "is_valid_ptr", 
 "getCacheEntry", and "set_cache_entry"?  If so, it seems that 
 all three functions are generated with unique python code, either in 
 an AST file or StateMachine.py.  Therefore, could we just pass these 
 functions "entry" and rely on the underlying python code to 
 generate the correct references?  This would make things more 
 readable, "is_valid_ptr()" becomes "is_valid", and it 
 doesn't require the slicc programmer to understand which functions take an 
 entry pointer versus the entry itself.
 If we can't make such a change, I worry about how much extra 
 complexity this change pushes on the slicc programmer.

 There are functions that are passed cache entry and transaction buffer entry 
 as arguments. Currently, I assume that these arguments are passed using 
 pointers.

 [BB] So does that mean that the cache entry is always passed in as a pointer? 
  If so, can one just use cache_entry for all function calls and remove any 
 use of cache_entry_ptr in the .sm files?  That is essentially what I would 
 like to see.


 Also another suggestion to make things more readable, please replace 
 the name L1IcacheMemory_entry with L1I_entry.  Do the same for 
 L1D_entry and L2_entry.  That will shorten many of your lines.

 The names of the cache entry variables are currently tied with the names of 
 the cache memory variables belonging to the machine. If the name of the cache 
 memory variable is A, then the corresponding cache entry variable is named 
 A_entry.

 [BB] Ah, I see.  Ok then let's just keep them the way they are for now.  We 
 can deal with shortening the names later.

 So am I correct that hammer's simultaneous usage of valid L1 and L2 
 cache entries in certain transitions is the only reason that within 
 all actions, the getCacheEntry calls take multiple cache entries?  If 
 so, I think it would be fairly trivial to use a tbe entry as an 
 intermediary between the L1 and L2 for those particular hammer 
 transitions.  That way only one cache entry is valid at any 
 particular time, and we can simply use the variable cache_entry in 
 the actions.  That should clean things up a lot.

 Oops! Should have thought of that before doing all those changes. But can we 
 assume that we would always have only one valid cache entry pointer at any 
 given time? If that's true, I would probably revert to previous version of 
 the patch. This should also resolve the naming issue.

 [BB] I wouldn't have expected you to realize that.  It is one of those things 
 that isn't completely obvious without spending a lot of time developing 
 protocols.  Yes, I think it is easiest for you to just revert to the previous 
 version of the patch and just modify the hammer protocol to use a tbe entry 
 as an intermediary.  We've always had an unofficial rule that a controller 
 can only manage multiple caches if those caches are exclusive with respect to 
 each other.  For the most part, that rule has been followed by all the 
 protocols I'm familiar with.  I think your change just makes that an official 
 policy.

 By the way, once you check in these patches, the MESI_CMP_directory 
 protocol will be deprecated, correct?  If so, make sure you include a 
 patch that removes it from the regression tester.

 I have a patch for the protocol, but I need to discuss it. Do you 
 think it is possible that a protocol is not actually in a deadlock, but 
 the random tester reports one because the underlying memory system is 
 taking too much time? The patch works for 1, 2, and 4 processors for 
 10,000,000 loads. I have tested these processor configurations with 40 
 different seed values. But for 8 processors, random

Re: [m5-dev] Deadlock while running ruby_random_test.py

2010-12-22 Thread Beckmann, Brad
Hi Nilay,

The following protocols (all of which are tested by the regression tester) 
should correctly work with the ruby random tester.

MOESI_CMP_directory
MOESI_hammer
MOESI_token
MI_example

Brad


-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Nilay Vaish
Sent: Tuesday, December 21, 2010 6:24 PM
To: M5 Developer List
Subject: Re: [m5-dev] Deadlock while running ruby_random_test.py

Brad, which protocols work correctly with ruby random tester?

On Tue, 21 Dec 2010, Beckmann, Brad wrote:

 Hi Nilay,

 If I'm correctly reproducing your problem, I believe I know what the issue 
 is.  However, before I try to fix it, I want to propose simply getting rid of 
 the MESI_CMP_directory.  The more and more I look at that protocol, the more 
 problems I see.  There are several design and logic issues in the protocol.  
 Unless someone wants to volunteer to fix them, I say we get rid of it as well 
 as all of the protocols not being tested by the regression tester.

 Now the particular problem that I see causing the deadlock is that that L2 
 cache is drop a PUTX request from the L1 because the L2 is in SS_MB state.  
 Thus the L1 remains in M_I state for infinity which of course will eventually 
 lead to a deadlock.

 Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
 Nilay Vaish
 Sent: Tuesday, December 21, 2010 1:04 PM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Deadlock while running ruby_random_test.py

 I am running ALPHA_SE_MESI_CMP_directory with ruby_random_test.py. I supply 
 the option -l as 2000. I have pasted the output below. This was generated 
 using latest version of m5.

 Actually, while testing my own changes to SLICC and protocol files, I also 
 observe the deadlock at the 301. So I ran the latest version and found 
 even that gets stuck.

 Is this a known problem? Am I doing something wrong?

 Thanks
 Nilay

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Deadlock while running ruby_random_test.py

2010-12-22 Thread Beckmann, Brad
Yep that is the beauty of the random tester.  It is much easier to fix problems 
when you can reproduce them in 3 M cycles vs. 200 B.

Brad


-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Nilay
Sent: Tuesday, December 21, 2010 8:00 PM
To: M5 Developer List
Subject: Re: [m5-dev] Deadlock while running ruby_random_test.py

It is kind of surprising that the random tester can detect the bug within
3,000,000 cycles while nothing happened on running ruby_fs.py for
200,000,000,000 cycles.


On Tue, December 21, 2010 8:24 pm, Nilay Vaish wrote:
 Brad, which protocols work correctly with ruby random tester?

 On Tue, 21 Dec 2010, Beckmann, Brad wrote:

 Hi Nilay,

 If I'm correctly reproducing your problem, I believe I know what the 
 issue is.  However, before I try to fix it, I want to propose simply 
 getting rid of the MESI_CMP_directory.  The more I look at 
 that protocol, the more problems I see.  There are several design and 
 logic issues in the protocol.  Unless someone wants to volunteer to 
 fix them, I say we get rid of it as well as all of the protocols not 
 being tested by the regression tester.

 Now the particular problem that I see causing the deadlock is that the 
 L2 cache drops a PUTX request from the L1 because the L2 is in the 
 SS_MB state.  Thus the L1 remains in the M_I state indefinitely, which 
 of course will eventually lead to a deadlock.

 Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On 
 Behalf Of Nilay Vaish
 Sent: Tuesday, December 21, 2010 1:04 PM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Deadlock while running ruby_random_test.py

 I am running ALPHA_SE_MESI_CMP_directory with ruby_random_test.py. I 
 supply the option -l as 2000. I have pasted the output below. This 
 was generated using latest version of m5.

 Actually, while testing my own changes to SLICC and protocol files, I 
 also observe the deadlock at the 301. So I ran the latest 
 version and found even that gets stuck.

 Is this a known problem? Am I doing something wrong?

 Thanks
 Nilay

 ___
 m5-dev mailing list
 m5-dev@m5sim.org
 http://m5sim.org/mailman/listinfo/m5-dev



--
Nilay

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Deadlock while running ruby_random_test.py

2010-12-21 Thread Beckmann, Brad
Hi Nilay,

If I'm correctly reproducing your problem, I believe I know what the issue is.  
However, before I try to fix it, I want to propose simply getting rid of the 
MESI_CMP_directory.  The more I look at that protocol, the more 
problems I see.  There are several design and logic issues in the protocol.  
Unless someone wants to volunteer to fix them, I say we get rid of it as well 
as all of the protocols not being tested by the regression tester.

Now the particular problem that I see causing the deadlock is that the L2 
cache drops a PUTX request from the L1 because the L2 is in the SS_MB state.  
Thus the L1 remains in the M_I state indefinitely, which of course will 
eventually lead to a deadlock.

Brad


-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Nilay Vaish
Sent: Tuesday, December 21, 2010 1:04 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Deadlock while running ruby_random_test.py

I am running ALPHA_SE_MESI_CMP_directory with ruby_random_test.py. I supply the 
option -l as 2000. I have pasted the output below. This was generated using 
latest version of m5.

Actually, while testing my own changes to SLICC and protocol files, I also 
observe the deadlock at the 301. So I ran the latest version and found 
even that gets stuck.

Is this a known problem? Am I doing something wrong?

Thanks
Nilay

-
M5 Simulator System

Copyright (c) 2001-2008
The Regents of The University of Michigan All Rights Reserved


M5 compiled Dec 21 2010 14:51:00
M5 revision 85e1847726e3 7798 default tip
M5 started Dec 21 2010 14:52:30
M5 executing on scamorza.cs.wisc.edu
command line: ./build/ALPHA_SE_MESI_CMP_directory/m5.debug
./configs/example/ruby_random_test.py -l 2000 Global frequency set at 
10 ticks per second
info: Entering event queue @ 0.  Starting simulation...
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:102: 
Possible Deadlock detected
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:102: 
Possible Deadlock detected
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:103: 
m_version is 0
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:103: 
m_version is 0
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:104: 
request->ruby_request.paddr is 1092
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:104: 
request->ruby_request.paddr is 1092
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:105: 
m_readRequestTable.size() is 4
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:105: 
m_readRequestTable.size() is 4
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:106: 
current_time is 301
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:106: 
current_time is 301
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:107: 
request->issue_time is 2292161
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:107: 
request->issue_time is 2292161
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:108: 
current_time - request->issue_time is 707840
Warning: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:108: 
current_time - request->issue_time is 707840 Fatal Error: in fn virtual void 
Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:109: 
Aborting
Fatal Error: in fn virtual void Sequencer::wakeup() in
build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:109: 
Aborting
Program aborted at cycle 301
Aborted

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Implementation of findTagInSet

2010-12-20 Thread Beckmann, Brad
Hi Nilay,

I apologize for the delay, but I was mostly travelling / in meetings last week 
and I didn't have a chance to review your patches and emails until this morning.

Overall, your patches are definitely solid steps in the right direction and 
your profiling data sounds very promising.  If you get the chance, please send 
it to me.  I would be interested to know what are the top performance 
bottlenecks after your change.

Before you spend time converting the other protocols, I do want to discuss the 
three points you brought up last week (see below).  I have a bunch of free time 
over the next three days (Mon. - Wed.) and I do think a telephone conversation 
is best to discuss these details.  Let me know what times work for you.

Brad


1. Currently the implicit TBE and Cache Entry pointers are set to NULL in the 
calls to doTransition() function. To set these, we would need to make calls to 
a function that returns the pointer if the address is in the cache, NULL 
otherwise.

I think we should retain the getEntry functions in the .sm files because, in the 
case of the L1 cache, both the instruction and the data cache need to be checked. 
This is something that I probably would prefer keeping out of SLICC. In fact, 
we should add getEntry functions for TBEs where ever required.

These getEntry would now return a pointer instead of a reference. We would need 
to add support for return_by_pointer to SLICC. Also, since these functions 
would be used inside the Wakeup function, we would need to assume a common name 
for them across all protocols, just like getState() function.

[BB] I would be very interested why you believe we should keep the getEntry 
functions out of SLICC.  In my mind, this is one of the few functions that is 
very consistent across protocols.  As I mentioned before, I really want to keep 
any notion of pointers out of the .sm files and avoid the changes you are 
proposing to getCacheEntry.  We should probably discuss this in detail 
over-the-phone.

2. I still think we would need to change the changePermission function in the 
CacheMemory class. Presently it calls findTagInSet() twice. Instead, we would 
pass on the CacheEntry whose permissions need to be changed. This would save 
one call. We should also put the variable m_locked in the AbstractCacheEntry 
(may be make it part of the permission variable) to avoid the second call.

[BB] I like moving the locked field to AbstractCacheEntry and removing the 
separate m_locked data structure.  However, just a minor point, but we should 
avoid duplicating code in CacheMemory to support this change.  Other than that, 
this looks good to me.
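A minimal Python sketch of the entry-based interface being discussed: the caller performs one tag lookup, then changes the permission and the locked bit directly on the entry, so changePermission no longer needs its own findTagInSet() calls. Class and method names here are illustrative stand-ins, not the real CacheMemory/AbstractCacheEntry code.

```python
class AbstractCacheEntry:
    def __init__(self):
        self.permission = "Invalid"
        self.locked = False        # folded in from the old m_locked map

class CacheMemory:
    def __init__(self):
        self.entries = {}          # address -> AbstractCacheEntry
        self.lookups = 0           # counts findTagInSet-style lookups

    def find_entry(self, address):
        # stands in for findTagInSet(); the old changePermission would
        # trigger this kind of lookup twice per permission change
        self.lookups += 1
        return self.entries.get(address)

    def change_permission(self, entry, new_perm):
        # entry-based: zero additional tag lookups needed here
        entry.permission = new_perm
        if new_perm == "Invalid":
            entry.locked = False
```

The point of the refactor is visible in the lookup counter: one lookup by the caller, none inside change_permission, versus two in the old interface.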

3. In the getState() and setState() functions, we need to specify that the 
function assumes that implicit TBE and CacheEntry pointers have been passed as 
arguments. How should we do this? I think we would need to push them into the 
symbol table before they can be used inside the function.

[BB] I'm a little confused by your current patch.  It appears that you are 
proposing having two pairs of getState and setState functions.  I would really 
like to avoid that and just have one pair of getState and setState functions.  
Also when I say implicitly pass the TBE and CacheEntry pointers, I mean that 
for the actions (similar to address).  However, I think it is fine to 
explicitly pass these parameters into getState and setState (also similar to 
Address and State).  


___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] changeset in m5: ARM: Add checkpointing support

2010-12-09 Thread Beckmann, Brad
Hi Ali,

I just synced with this changeset 7733, as well as changeset 7730, and I now 
notice that the modifications to physical.cc break all previous checkpoints.  
Can we put the lal_addr and lal_cid serialization and unserialization in a 
conditional that tests for the ARM ISA?  I welcome other suggestions as well.

In general, I would be interested to hear other people's thoughts on adding a 
checkpoint test to the regression tester.  It would be great if we can at least 
identify ahead of time what changesets break older checkpoints.

Brad



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Ali Saidi
 Sent: Monday, November 08, 2010 11:59 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] changeset in m5: ARM: Add checkpointing support
 
 changeset 08d6a773d1b6 in /z/repo/m5
 details: http://repo.m5sim.org/m5?cmd=changeset;node=08d6a773d1b6
 description:
   ARM: Add checkpointing support
 
 diffstat:
 
  src/arch/arm/isa.hh  |  12 +-
  src/arch/arm/linux/system.cc |   5 +-
  src/arch/arm/linux/system.hh |   4 +-
  src/arch/arm/pagetable.hh|  87 +++
 
  src/arch/arm/table_walker.cc |  16 ++-
  src/arch/arm/table_walker.hh |   2 +-
  src/arch/arm/tlb.cc  |  14 ++-
  src/arch/arm/tlb.hh  |   2 -
  src/dev/arm/gic.cc   |  44 +-
  src/dev/arm/pl011.cc |  42 -
  src/dev/arm/rv_ctrl.cc   |   2 -
  src/dev/arm/timer_sp804.cc   |  59 -
  src/dev/arm/timer_sp804.hh   |   4 ++
  src/mem/physical.cc  |  30 +++
  src/mem/physical.hh  |   5 ++
  src/sim/SConscript   |   1 +
  src/sim/system.cc|   2 +-
  src/sim/system.hh|   2 +-
  18 files changed, 268 insertions(+), 65 deletions(-)
 
 diffs (truncated from 587 to 300 lines):
 
 diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/isa.hh
 --- a/src/arch/arm/isa.hh Mon Nov 08 13:58:24 2010 -0600
 +++ b/src/arch/arm/isa.hh Mon Nov 08 13:58:25 2010 -0600
 @@ -178,10 +178,18 @@
  }
 
   void serialize(EventManager *em, std::ostream &os)
  -{}
  +{
  +DPRINTF(Checkpoint, "Serializing Arm Misc Registers\n");
  +SERIALIZE_ARRAY(miscRegs, NumMiscRegs);
  +}
   void unserialize(EventManager *em, Checkpoint *cp,
   const std::string &section)
  -{}
  +{
  +DPRINTF(Checkpoint, "Unserializing Arm Misc Registers\n");
  +UNSERIALIZE_ARRAY(miscRegs, NumMiscRegs);
  +CPSR tmp_cpsr = miscRegs[MISCREG_CPSR];
  +updateRegMap(tmp_cpsr);
  +}
 
  ISA()
  {
 diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/linux/system.cc
 --- a/src/arch/arm/linux/system.ccMon Nov 08 13:58:24 2010 -0600
 +++ b/src/arch/arm/linux/system.ccMon Nov 08 13:58:25 2010 -0600
 @@ -99,9 +99,9 @@
  }
 
  void
 -LinuxArmSystem::startup()
 +LinuxArmSystem::initState()
  {
 -ArmSystem::startup();
 +ArmSystem::initState();
  ThreadContext *tc = threadContexts[0];
 
  // Set the initial PC to be at start of the kernel code
 @@ -117,7 +117,6 @@
  {
  }
 
 -
  LinuxArmSystem *
  LinuxArmSystemParams::create()
  {
 diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/linux/system.hh
 --- a/src/arch/arm/linux/system.hhMon Nov 08 13:58:24 2010 -0600
 +++ b/src/arch/arm/linux/system.hhMon Nov 08 13:58:25 2010 -0600
 @@ -67,8 +67,8 @@
  LinuxArmSystem(Params *p);
  ~LinuxArmSystem();
 
 -/** Initialize the CPU for booting */
 -void startup();
 +void initState();
 +
private:
  #ifndef NDEBUG
  /** Event to halt the simulator if the kernel calls panic()  */
 diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/pagetable.hh
 --- a/src/arch/arm/pagetable.hh   Mon Nov 08 13:58:24 2010 -0600
 +++ b/src/arch/arm/pagetable.hh   Mon Nov 08 13:58:25 2010 -0600
 @@ -48,6 +48,8 @@
   #include "arch/arm/vtophys.hh"
   #include "config/full_system.hh"
  
  +#include "sim/serialize.hh"
 +
  namespace ArmISA {
 
  struct VAddr
 @@ -71,39 +73,6 @@
 
  };
 
 -struct TlbRange
 -{
 -Addr va;
 -Addr size;
 -int contextId;
 -bool global;
 -
  -inline bool
  -operator<(const TlbRange &r2) const
  -{
  -if (!(global || r2.global)) {
  -if (contextId < r2.contextId)
  -return true;
  -else if (contextId > r2.contextId)
  -return false;
  -}
  -
  -if (va < r2.va)
  -return true;
  -return false;
  -}
 -
  -inline bool
  -operator==(const TlbRange &r2) const
  -{
  -return va == r2.va &&
  -   size == r2.size &&
  -   contextId == r2.contextId &&
  -   global == r2.global;
  -}
  -};
 -
 -
  // ITB/DTB table entry
  struct TlbEntry
  {
 @@ -143,10 +112,8 @@
 
  // Access permissions
  bool xn;

Re: [m5-dev] changeset in m5: Mem: Finish half-baked support for mmaping file...

2010-12-09 Thread Beckmann, Brad
Hi Ali,

This is changeset 7730, which also breaks all previous checkpoints because it 
requires physmem to serialize and unserialize the variable _size.

Brad

 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Ali Saidi
 Sent: Monday, November 08, 2010 11:59 AM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] changeset in m5: Mem: Finish half-baked support for
 mmaping file...
 
 changeset 982b4c6c1470 in /z/repo/m5
 details: http://repo.m5sim.org/m5?cmd=changeset;node=982b4c6c1470
 description:
   Mem: Finish half-baked support for mmaping file in physmem.
 
   Physmem has a parameter to be able to mem map a file, however
   it isn't actually used. This changeset utilizes the parameter
   so a file can be mmapped.
 
 diffstat:
 
  configs/common/FSConfig.py |   8 ++-
  src/mem/physical.cc|  48 +++--
 
  src/mem/physical.hh|   8 +++---
  3 files changed, 44 insertions(+), 20 deletions(-)
 
 diffs (176 lines):
 
 diff -r d3c006ecccd3 -r 982b4c6c1470 configs/common/FSConfig.py
 --- a/configs/common/FSConfig.py  Mon Nov 08 13:58:24 2010 -0600
 +++ b/configs/common/FSConfig.py  Mon Nov 08 13:58:24 2010 -0600
 @@ -200,9 +200,12 @@
  self.membus.badaddr_responder.warn_access = warn
  self.bridge = Bridge(delay='50ns', nack_delay='4ns')
  self.physmem = PhysicalMemory(range = AddrRange(mdesc.mem()), zero
 = True)
 +self.diskmem = PhysicalMemory(range = AddrRange(Addr('128MB'),
 size = '128MB'),
 +  file = disk('ael-arm.ext2'))
  self.bridge.side_a = self.iobus.port
  self.bridge.side_b = self.membus.port
  self.physmem.port = self.membus.port
 +self.diskmem.port = self.membus.port
 
  self.mem_mode = mem_mode
 
 @@ -224,7 +227,10 @@
 
  self.intrctrl = IntrControl()
  self.terminal = Terminal()
 -self.boot_osflags = 'earlyprintk mem=128MB console=ttyAMA0
 lpj=19988480 norandmaps'
 +self.kernel = binary('vmlinux.arm')
 +self.boot_osflags = 'earlyprintk mem=128MB console=ttyAMA0
 lpj=19988480' + \
 +' norandmaps
 slram=slram0,0x800,+0x800' +  \
 +' mtdparts=slram0:- rw loglevel=8
 root=/dev/mtdblock0'
 
  return self
 
 diff -r d3c006ecccd3 -r 982b4c6c1470 src/mem/physical.cc
 --- a/src/mem/physical.cc Mon Nov 08 13:58:24 2010 -0600
 +++ b/src/mem/physical.cc Mon Nov 08 13:58:24 2010 -0600
 @@ -31,6 +31,7 @@
 
   #include <sys/types.h>
   #include <sys/mman.h>
  +#include <sys/user.h>
   #include <errno.h>
   #include <fcntl.h>
   #include <unistd.h>
  @@ -41,6 +42,7 @@
   #include <string>
  
   #include "arch/registers.hh"
  +#include "base/intmath.hh"
   #include "base/misc.hh"
   #include "base/random.hh"
   #include "base/types.hh"
 @@ -56,26 +58,39 @@
  PhysicalMemory::PhysicalMemory(const Params *p)
  : MemObject(p), pmemAddr(NULL), pagePtr(0),
 lat(p->latency), lat_var(p->latency_var),
  -  cachedSize(params()->range.size()), cachedStart(params()->range.start)
  +  _size(params()->range.size()), _start(params()->range.start)
   {
  -if (params()->range.size() % TheISA::PageBytes != 0)
  +if (size() % TheISA::PageBytes != 0)
   panic("Memory Size not divisible by page size\n");
 
  if (params()-null)
  return;
 
  -int map_flags = MAP_ANON | MAP_PRIVATE;
  -pmemAddr = (uint8_t *)mmap(NULL, params()->range.size(),
  -   PROT_READ | PROT_WRITE, map_flags, -1, 0);
  +
  +if (params()->file == "") {
  +int map_flags = MAP_ANON | MAP_PRIVATE;
  +pmemAddr = (uint8_t *)mmap(NULL, size(),
  +   PROT_READ | PROT_WRITE, map_flags, -1, 0);
  +} else {
  +int map_flags = MAP_PRIVATE;
  +int fd = open(params()->file.c_str(), O_RDONLY);
  +_size = lseek(fd, 0, SEEK_END);
  +lseek(fd, 0, SEEK_SET);
  +pmemAddr = (uint8_t *)mmap(NULL, roundUp(size(), PAGE_SIZE),
  +   PROT_READ | PROT_WRITE, map_flags, fd, 0);
  +}
  
   if (pmemAddr == (void *)MAP_FAILED) {
   perror("mmap");
  -fatal("Could not mmap!\n");
  +if (params()->file == "")
  +fatal("Could not mmap!\n");
  +else
  +fatal("Could not find file: %s\n", params()->file);
   }
  
   //If requested, initialize all the memory to 0
   if (p->zero)
  -memset(pmemAddr, 0, p->range.size());
  +memset(pmemAddr, 0, size());
  }
 
  void
 @@ -94,8 +109,7 @@
  PhysicalMemory::~PhysicalMemory()
  {
     if (pmemAddr)
-        munmap((char*)pmemAddr, params()->range.size());
-    //Remove memPorts?
+        munmap((char*)pmemAddr, size());
  }
 
  Addr
 @@ -408,7 +422,7 @@
  {
  snoop = false;
  resp.clear();
-    resp.push_back(RangeSize(start(), params()->range.size()));
+    resp.push_back(RangeSize(start(), size()));
  }
 
  unsigned
 @@ -463,6 +477,7 @@
  string 

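For reference, the file-backed branch this patch adds to PhysicalMemory can be sketched with Python's `mmap` module. This is a hedged illustration of the same mmap pattern, not gem5 code: the function name `map_backing_store` and its defaults are invented here, and Python maps the exact file length rather than rounding up to a page as the C++ does.

```python
import mmap
import os


def map_backing_store(path=None, size=4096):
    """Return (mapping, size): anonymous when no file is given,
    otherwise a private (copy-on-write) mapping of the file."""
    if path is None:
        # The MAP_ANON | MAP_PRIVATE branch: fd of -1 gives an
        # anonymous, zero-filled mapping of the requested size.
        return mmap.mmap(-1, size), size
    # File-backed branch: size the memory from the file itself,
    # mirroring the patch's lseek(fd, 0, SEEK_END).
    fd = os.open(path, os.O_RDONLY)
    file_size = os.lseek(fd, 0, os.SEEK_END)
    os.lseek(fd, 0, os.SEEK_SET)
    # MAP_PRIVATE makes stores visible to the simulator without
    # ever writing them back to the disk image.
    mem = mmap.mmap(fd, file_size,
                    prot=mmap.PROT_READ | mmap.PROT_WRITE,
                    flags=mmap.MAP_PRIVATE)
    os.close(fd)  # the mapping stays valid after the fd is closed
    return mem, file_size
```

The copy-on-write choice matters for simulation: a guest can scribble over its "RAM" while the underlying image file on disk stays pristine for the next run.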
Re: [m5-dev] Implementation of findTagInSet

2010-12-09 Thread Beckmann, Brad
Hi Nilay,

Yes, I believe a machine can be accessed within AST class functions, though I 
don't remember ever doing it myself.  Look at the generate() function in 
TypeFieldEnumAST.  Here you see that the machine (a.k.a StateMachine) is 
grabbed from the symbol table and then different StateMachine functions are 
called on it.  You can imagine adding a new function to StateMachine.py that 
returns whether the TBETable exists.

That seems like it should work to me, but let me know if it doesn't.

Brad
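
As a rough illustration of the pattern Brad describes (an AST node grabbing the StateMachine from the symbol table and calling machine-level queries on it), here is a self-contained Python sketch. The class names echo SLICC's, but everything below is a hypothetical stand-in rather than gem5's actual generator code; in particular, `hasTBETable()` is the new query being proposed, not an existing StateMachine method.

```python
class StateMachine:
    """Stand-in for SLICC's StateMachine symbol."""
    def __init__(self, ident, has_tbe_table):
        self.ident = ident
        self._has_tbe = has_tbe_table

    def hasTBETable(self):
        # The new machine-level query Brad suggests adding.
        return self._has_tbe


class SymbolTable:
    """Stand-in symbol table keyed by identifier."""
    def __init__(self):
        self._syms = {}

    def register(self, sym):
        self._syms[sym.ident] = sym

    def find(self, ident, kind):
        # Return the symbol only if it is of the expected kind.
        sym = self._syms.get(ident)
        return sym if isinstance(sym, kind) else None


class TypeFieldEnumAST:
    """Stand-in AST node holding a reference to the symbol table."""
    def __init__(self, symtab, machine_name):
        self.symtab = symtab
        self.machine_name = machine_name

    def generate(self):
        # Grab the machine from the symbol table, then call
        # StateMachine functions on it, as generate() does in SLICC.
        machine = self.symtab.find(self.machine_name, StateMachine)
        return machine is not None and machine.hasTBETable()
```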



 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
 Sent: Thursday, December 09, 2010 5:24 PM
 To: M5 Developer List
 Subject: Re: [m5-dev] Implementation of findTagInSet
 
 Hi Brad
 
 Is there way to access the StateMachine object inside any of the AST
 class
 functions? I know the name of the machine can be accessed. But can the
 machine itself be accessed? I need one of the variables in the
 StateMachine object to know whether or not a TBETable exists in this
 machine.
 
 
 Nilay
 
 On Wed, 8 Dec 2010, Beckmann, Brad wrote:
 
  Hi Nilay,
 
  I think we can avoid handling pointers in the getState and setState
 functions if we also add bool functions is_cache_entry_valid and
 is_tbe_entry_valid that are implicitly defined in SLICC.  I don't
 think we should try to get rid of getState and setState since they
 often contain valuable, protocol-specific checks in them.  Instead for
 getState and setState, I believe we should simply replace the current
 isTagPresent calls with the new is_*_valid calls.
 
  As far as changePermission() goes, your solution seems reasonable,
 but we may also want to consider just not changing that function at
 all.  ChangePermission() doesn't actually use a cache entry within the
 .sm file, so is doesn't necessarily need to be changed.  Going back to
 breaking this work into smaller portions, that is definitely a portion
 I feel can be pushed to the end or removed entirely.
 
  Brad
 
 
  -Original Message-
  From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
 Behalf Of Nilay Vaish
  Sent: Wednesday, December 08, 2010 11:53 AM
  To: M5 Developer List
  Subject: Re: [m5-dev] Implementation of findTagInSet
 
  Hi Brad,
 
  A couple of observations
 
  a. If we make use of pointers, would we not need to handle them in
 getState and setState functions?
 
  b. changePermission() seems to be a problem. It would still perform a
lookup because whether a CacheEntry is locked or not is
 maintained in the CacheMemory object and not with the entry itself. We
 can move that variable to be part of the AbstractCacheEntry or we can
 combine it with the permission variable which is already there in the
 AbstractCacheEntry class. I think lock is only used in the
 implementation of LL/SC instructions.
 
  Nilay
 
 
  On Wed, 8 Dec 2010, Beckmann, Brad wrote:
 
  Hi Nilay,
 
  Breaking the changes into small portions is a good idea, but we
 first need to decide exactly what we are doing.  So far we've only
thrown out some ideas.  We have yet to scope out a complete
 solution.  I think we've settled on passing some sort of reference to
 the cache and tbe entries, but exactly whether that is by reference
 variables or pointers isn't clear.  My initial preference is to use
 pointers in the generated code and set the pointers to NULL when a
 cache and/or tbe entry doesn't exist.  However, one thing I really want
 to strive for is to keep pointer manipulation out of the .sm files.
 Writing SLICC code is hard enough and we don't want to burden the SLICC
 programmer with memory management as well.
 
  So how about this plan?
  - Let's remove all the getCacheEntry functions from the slicc files.
 I believe that almost all of these functions look exactly the same and
 it is easy enough for SLICC to just generate them instead.
 - Similarly let's get rid of all isCacheTagPresent functions as
 well
  - Then let's replace all the getCacheEntry calls with an implicit
 SLICC supported variable called cache_entry and all the TBEs[addr*]
 calls with an implicit SLICC supported variable called tbe_entry.
 - Underneath these variables can actually be implemented as local
 inlined functions that assert whether the entries are valid and then
 return variables local to the state machine set to the current cache
 and tbe entry.
 - The trigger function will implicitly set these variables
(pointers underneath) to NULL or valid values, and the only way they
 can be reset is through explicit functions set_cache_entry,
 reset_cache_entry, set_tbe_entry, and reset_tbe_entry.  These
 function would be called by the appropriate actions or possibly be
 merged with the existing check_allocate function.
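
A self-contained Python sketch of how the generated code might realize this plan: the trigger implicitly binds `cache_entry`, validity is asserted on every use, and the binding can only change through explicit set/reset helpers. All names here are hypothetical, modeled on the proposal above, not on gem5's actual generated C++.

```python
class CacheEntry:
    """Stand-in for a protocol's cache entry type."""
    def __init__(self, addr):
        self.addr = addr


class Controller:
    def __init__(self):
        # Backing slot for the implicit SLICC variable; None plays
        # the role of a NULL pointer when the block is not cached.
        self._cache_entry = None

    def trigger(self, event, addr, entry):
        # The trigger implicitly sets cache_entry (possibly None)
        # before the transition's actions run.
        self._cache_entry = entry
        self.last_event = (event, addr)

    def cache_entry(self):
        # The implicit variable: assert validity then return the
        # entry, keeping pointer manipulation out of the .sm files.
        assert self._cache_entry is not None, "invalid cache_entry"
        return self._cache_entry

    def set_cache_entry(self, entry):
        # Only an explicit action (e.g. on allocation) may rebind it.
        assert self._cache_entry is None
        self._cache_entry = entry

    def reset_cache_entry(self):
        # Called by deallocation actions.
        self._cache_entry = None
```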
 
  I think that will give us what we want, but I realize I've just
proposed changing 100's if not 1000's of lines of SLICC code.  I hope that
these changes are straightforward, but any change like that is never
