Re: [gem5-dev] Ruby: Token Coherence and Functional Access
Yes, the token protocol is definitely one of those protocols that prevents us from tightly coupling the functional access support to the protocols. However, I don't think this issue will result in silently corrupted behavior. Instead, it seems the result would be an error generated in the simulation, correct? Specifically, in the example you mention, all controllers are in the stable Invalid state, right? Therefore, the functional access won't find a valid block anywhere, and an error will be generated. That seems like the right behavior to me. Brad -Original Message- From: gem5-dev-boun...@m5sim.org [mailto:gem5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Friday, June 10, 2011 8:50 AM To: gem5-dev@m5sim.org Subject: [gem5-dev] Ruby: Token Coherence and Functional Access Brad, in the token coherence protocol, the L2 cache controller moves from state O to I and sends data to the memory. I think this particular transition may pose a problem in enabling functional accesses for the protocol. The problem, I think, is that both the directory and the cache controller are in stable states even though there is data travelling in the network. This means that both controllers will allow a functional write to go ahead. But then the data will be overwritten by the value sent from the L2 controller to the directory controller. My understanding of the protocol implementation is close to \epsilon. I think this is what I observed today in the morning. Do you think this understanding is correct? -- Nilay ___ gem5-dev mailing list gem5-dev@m5sim.org http://m5sim.org/mailman/listinfo/gem5-dev
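The check Brad describes, scanning the controllers and raising an error when no valid copy exists rather than silently corrupting data, can be sketched roughly as below (a minimal illustration; the enum values and function name are assumptions, not gem5's actual functional-access API):

```cpp
#include <cassert>
#include <vector>

// Hypothetical sketch, not gem5's real interface: a functional read
// succeeds only if some controller holds the block in a readable
// stable state. If every controller is in stable Invalid (the token
// protocol case discussed above), we return false so the caller can
// generate a simulation error instead of silently corrupting data.
enum class AccessPermission { ReadWrite, ReadOnly, Busy, Invalid };

bool
functionalReadPossible(const std::vector<AccessPermission> &controllers)
{
    for (AccessPermission p : controllers) {
        if (p == AccessPermission::ReadWrite ||
            p == AccessPermission::ReadOnly) {
            return true;  // found a valid copy to service the read
        }
    }
    return false;  // no valid block anywhere: flag an error
}
```

With every controller Invalid the function reports failure, which matches the "error, not silent corruption" behavior Brad expects.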
Re: [m5-dev] Move of Garnet/Orion config file
Hi Korey, We are in the process of moving all the Orion code out of Ruby and into McPAT. When that is complete, I suspect that router.cfg file will be removed. Tushar, please correct me if I'm wrong. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Tuesday, May 10, 2011 4:19 PM To: M5 Developer List Subject: [m5-dev] Move of Garnet/Orion config file There is one detail in Garnet/Orion that probably needs to be moved to python. It's the router.cfg file for the Orion stats, which is hard coded into the C++ here: m5-ix-link-ruby/src/mem/ruby/network/orion/NetworkPower.cc:79: const string cfg_fn = "src/mem/ruby/network/orion/router.cfg"; The contents of that file are just a bunch of values to be read by Orion when Garnet finishes. As it is now, this forces you to copy this long directory path whenever you are exporting the build directory to run M5 on a cluster of machines. Tushar, is this something that you'd be willing to look into? Someone else? -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Ruby: Correctly set access permissions for directory entries
The stats and regression tester should not need to be updated with this patch. This is purely a Ruby/SLICC mechanism change. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Saturday, May 07, 2011 6:17 AM To: M5 Developer List Subject: Re: [m5-dev] Review Request: Ruby: Correctly set access permissions for directory entries Korey, I don't think there will be any change in the simulation performance. I am not sure about the stats. Brad, were the stats updated after you made the change? -- Nilay On Fri, 6 May 2011, Korey Sewell wrote: Nilay, can you explain the impact of that bug in terms of simulation performance? Are benchmarks running slower because of this change? Will regressions need to be updated? On Fri, May 6, 2011 at 8:13 PM, Beckmann, Brad brad.beckm...@amd.com wrote: Hi Nilay, Yeah, pulling the State into the Machine makes sense to me. If I recall, my previous patch made it necessary that each machine included a state_declaration (previously the state enum). More tightly integrating the state to the machine seems to be a natural progression on that path. I understand moving the permission settings back to setState is the easiest way to make this work. However, it would be great if we could combine the setting of state and the setting of permission into one function call from the .sm file. Thus we don't have to worry about the situation where one sets the state but forgets to set the permission. That could lead to some random functional access failing and a very painful debug. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Ruby: Correctly set access permissions for directory entries
Hi Nilay, Yeah, pulling the State into the Machine makes sense to me. If I recall, my previous patch made it necessary that each machine included a state_declaration (previously the state enum). More tightly integrating the state to the machine seems to be a natural progression on that path. I understand moving the permission settings back to setState is the easiest way to make this work. However, it would be great if we could combine the setting of state and the setting of permission into one function call from the .sm file. Thus we don't have to worry about the situation where one sets the state but forgets to set the permission. That could lead to some random functional access failing and a very painful debug. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Friday, May 06, 2011 3:52 PM To: Nilay Vaish; Default Subject: [m5-dev] Review Request: Ruby: Correctly set access permissions for directory entries --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/684/ --- Review request for Default. Summary --- Ruby: Correctly set access permissions for directory entries The access permissions for the directory entries are not being set correctly. This is because pointers are not used for handling directory entries. function. The setState() function will once again set the permissions as well. But it would make use of the State_to_permission() function, instead of doing the analysis it used to do earlier. The changePermission() function provided by the AbstractEntry and AbstractCacheEntry classes has been exposed to SLICC code once again. The set_permission() functionality has been removed. I have done this only for the MESI protocol so far. Once we build a consensus on the changes, I will make changes to other protocols as well. As far as testing is concerned, the protocol compiles and clears 1 loads. I did not test any more than that.
A point that I wanted to raise for discussion: I think we should pull the State enum and the accompanying functions into the Machine itself. Brad, what do you think? Diffs - src/mem/protocol/MESI_CMP_directory-L1cache.sm 3c628a51f6e1 src/mem/protocol/MESI_CMP_directory-L2cache.sm 3c628a51f6e1 src/mem/protocol/MESI_CMP_directory-dir.sm 3c628a51f6e1 src/mem/protocol/RubySlicc_Types.sm 3c628a51f6e1 src/mem/slicc/ast/MethodCallExprAST.py 3c628a51f6e1 src/mem/slicc/symbols/StateMachine.py 3c628a51f6e1 Diff: http://reviews.m5sim.org/r/684/diff Testing --- Thanks, Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
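Brad's suggestion in this thread, folding the permission update into setState() so a protocol writer cannot set the state and forget the permission, can be sketched as follows (illustrative stand-ins only; State, State_to_permission(), and Entry here are not the actual SLICC-generated identifiers):

```cpp
#include <cassert>

// Sketch of deriving the access permission from the state inside a
// single setState() call. The names are illustrative, not the real
// SLICC-generated code.
enum class State { I, S, E, M };
enum class AccessPermission { Invalid, ReadOnly, ReadWrite };

AccessPermission
State_to_permission(State s)
{
    switch (s) {
      case State::I: return AccessPermission::Invalid;
      case State::S: return AccessPermission::ReadOnly;
      default:       return AccessPermission::ReadWrite;  // E or M
    }
}

struct Entry {
    State state = State::I;
    AccessPermission perm = AccessPermission::Invalid;
};

// One call updates both fields, so state and permission can never
// drift apart and functional accesses see a consistent view.
void
setState(Entry &e, State s)
{
    e.state = s;
    e.perm = State_to_permission(s);
}
```

The point of the design is exactly the failure mode Brad names: with separate setState()/set_permission() calls, a forgotten permission update surfaces only as a hard-to-debug functional access failure.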
Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick
I can't reproduce these scons errors and they don't seem to happen from a clean build. Can we blow away the current build directory on zizzer and re-run the regression tester? I would do it myself, but I don't have access to zizzer. Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Cron Daemon Sent: Friday, April 29, 2011 12:17 AM To: m5-dev@m5sim.org Subject: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick

scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_RubyNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_Topology_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_RubySystem_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_GarnetNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_SimpleNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE/python/m5/internal/param_GarnetNetwork_d_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_RubyNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_Topology_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_RubySystem_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_GarnetNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_SimpleNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_hammer/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_hammer/python/m5/internal/param_GarnetNetwork_d_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_RubyNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_Topology_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_RubySystem_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_GarnetNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_SimpleNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MESI_CMP_directory/python/m5/internal/param_GarnetNetwork_d_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_CMP_directory/python/m5/internal/param_RubyNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_CMP_directory/python/m5/internal/param_BaseGarnetNetwork_wrap.cc'.
scons: *** Implicit dependency `build/ALPHA_SE_MOESI_CMP_directory/params/ExtLink.hh' not found, needed by target `build/ALPHA_SE_MOESI_CMP_directory/python/m5/internal/param_Topol
Re: [m5-dev] Code Reviewing
I suspect that my recent review posts motivated this thread. Overall, I think the policy you suggested, Nate, has been our informal policy. The reason I posted my somewhat trivial changes to reviewboard this morning was to give Tushar a chance to comment on my changes before I pushed them. Also, though one of my patches is a single line, it will require a new set of regression tester stats. I felt that kind of change needed to be highlighted by posting a review. Maybe the best policy is to make better use of the -U and -G options of postreview. I know I'm guilty of not using those options before, but I really should have specified -U tushar for those patches. Right now, if one doesn't specify -U or -G, the review is sent to the default group (which is all of m5-dev, correct?) and gabe, ali, steve, and nate. Even when -U is specified, it only adds the additional user to the list and doesn't overwrite gabe, ali, steve, and nate. Instead, maybe we still send patches to the default group, but remove the current list of four users. That way we can better customize who the explicit targets are. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabriel Michael Black Sent: Wednesday, April 27, 2011 11:25 AM To: m5-dev@m5sim.org Subject: Re: [m5-dev] Code Reviewing That sounds reasonable. With too many reviews it gets harder to get to all of them, and some obscure things may languish with no reviews because only one person is comfortable with that code. Reviews are generally a really good thing, but they have some overhead. If we don't get more benefit than that threshold, they aren't worth it in that case. Gabe Quoting nathan binkert n...@binkert.org: Hi Everyone, We don't have an official policy on code reviews, but I think we're being a bit pedantic with them.
While I definitely want us to err on the side of having a code review if the author has any doubt, I think it is completely unnecessary to have reviews on things like changing comments and text in strings. Similarly, obvious bug fixes (though this is one of those subjective things that the author has to consider) need not be reviewed. What do you all think? What is our policy? Am I crazy? Should we review everything? Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Trace not working
Maybe I'm doing something stupid here, but on a clean checkout, the following short patch encounters the subsequent compiler error:

diff --git a/src/mem/SConscript b/src/mem/SConscript
--- a/src/mem/SConscript
+++ b/src/mem/SConscript
@@ -57,6 +57,7 @@
 TraceFlag('BusAddrRanges')
 TraceFlag('BusBridge')
 TraceFlag('LLSC')
+TraceFlag('FlagCheck')
 TraceFlag('MMU')
 TraceFlag('MemoryAccess')

diff --git a/src/mem/port.cc b/src/mem/port.cc
--- a/src/mem/port.cc
+++ b/src/mem/port.cc
@@ -106,6 +106,7 @@
 Port::setPeer(Port *port)
 {
     DPRINTF(Config, "setting peer to %s\n", port->name());
+    DPRINTF(FlagCheck, "check setting peer to %s\n", port->name());
     peer = port;
 }

Error:

scons: Building targets ...
 [ CXX] X86_SE_MOESI_hammer/mem/port.cc -> .do
build/X86_SE_MOESI_hammer/mem/port.cc: In member function 'virtual void Port::setPeer(Port*)':
build/X86_SE_MOESI_hammer/mem/port.cc:109: error: 'FlagCheck' is not a member of 'Debug'
 [ SWIG] X86_SE_MOESI_hammer/python/m5/internal/vptype_IntLink.i -> _wrap.cc, .py
 [ SWIG] X86_SE_MOESI_hammer/python/m5/internal/vptype_AddrRange.i -> _wrap.cc, .py
scons: *** [build/X86_SE_MOESI_hammer/mem/port.do] Error 1
 [ SWIG] X86_SE_MOESI_hammer/python/m5/internal/vptype_Process.i -> _wrap.cc, .py
scons: building terminated because of errors.

Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of nathan binkert Sent: Monday, April 25, 2011 5:29 PM To: M5 Developer List Subject: Re: [m5-dev] Trace not working However, I am confused as well on how to add a new TraceFlag/DebugFlag. It seems that all the previous flags are still specified using the TraceFlag() function, but I can't seem to be able to specify a new one. Also, to be consistent, should we change the name of the TraceFlag function to DebugFlag? You should still use the TraceFlag function in SCons. Are you sure this doesn't work? And yes, I should probably rename TraceFlag to DebugFlag.
Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] what scons can do
-Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of nathan binkert Sent: Thursday, April 21, 2011 5:53 PM To: M5 Developer List Subject: Re: [m5-dev] what scons can do Maybe so... I think there's a subconscious impression that it takes a while because there's a phase in the build that takes a noticeable amount of time and that's all the output you see. If in fact that delay is 10% running SLICC and 90% scons doing other stuff silently then I agree it's not such a big deal. I think it's more like 1%/99% :) Is that 1%/99% a statement of a clean build for m5.fast? I think the much more common case is you edit one .cc file and rebuild. In that situation, it sure seems like a lot more than 1% of the time is spent by scons regenerating and reanalyzing SLICC files. Whatever it may be, it sure would be great if we could speed things up. I'm happy to help however I can. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Defining Miss Latencies in Ruby
Hi Korey, I'm confused. The miss_latency calculated by the sequencer is the miss latency of the particular request, not just L1 cache hits. If you're seeing a bunch of minimum latency requests, I suspect something else is wrong. For instance, is issued_time a cycle value or a tick value? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Wednesday, April 20, 2011 11:38 AM To: M5 Developer List Subject: [m5-dev] Defining Miss Latencies in Ruby Hi all, I've been working on miss latencies and stats in Ruby caches and I noticed something that might be a bug in tracking miss stats. The code in Sequencer.cc has the following check for looking at a miss:

Time miss_latency = g_eventQueue_ptr->getTime() - issued_time;
// Profile the miss latency for all non-zero demand misses
if (miss_latency != 0) {
    // track miss stats
}

Should this not instead be L1_cache_latency or 2 * L1_cache_latency (if it has to be buffered both ways)? The effect of this, I think, is a saturation of the miss latency histogram in the 1st bucket. If anyone has any thoughts, let me know, as I could be missing something here ... :) -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Defining Miss Latencies in Ruby
Sure, it is recording all miss latencies, including L1 cache hits. And yes, those L1 hits will show up in the first bucket. However, I don't see how that is a bug. If you don't want to include L1 hits in the histogram, then look at how the MOESI_hammer protocol tracks separate miss latencies depending on the responding machine type. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Wednesday, April 20, 2011 7:20 PM To: M5 Developer List Subject: Re: [m5-dev] Defining Miss Latencies in Ruby (comments inline) I'm confused. The miss_latency calculated by the sequencer is the miss latency of the particular request, not just L1 cache hits. I think I understand that, but even if it's just L1 hits, let's say that the L1 latency is 1 and you are running a workload with a high hit rate in the L1s. Then doesn't the code continuously record that L1 hit in the 1st histogram bucket? This would definitely be the case for L1 latencies of more than 1, since it's hardcoded to record everything with a latency > 0 (basically all requests), right? If you're seeing a bunch of minimum latency requests, I suspect something else is wrong. For instance, is issued_time a cycle value or a tick value? The issued_time is in cycles, as it is set in the Sequencer's makeRequest() function when a new request is built. -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
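Brad's pointer, keeping separate latency histograms keyed by the responding machine type as MOESI_hammer does, might look roughly like this (a sketch with assumed names, not the actual Ruby profiler code); L1 hits then land in their own histogram instead of saturating the first bucket of the miss histogram:

```cpp
#include <cassert>
#include <map>

// Illustrative sketch: bucket request latencies by the machine that
// responded, rather than filtering on "miss_latency != 0". The
// MachineType values and struct name are assumptions for this example.
enum class MachineType { L1Cache, L2Cache, Directory };

struct LatencyStats {
    // responder -> (latency -> count), i.e. one histogram per machine type
    std::map<MachineType, std::map<long, int>> histograms;

    void record(MachineType responder, long latency)
    {
        histograms[responder][latency]++;
    }

    int samples(MachineType responder) const
    {
        auto it = histograms.find(responder);
        int n = 0;
        if (it != histograms.end())
            for (auto &kv : it->second)
                n += kv.second;
        return n;
    }
};
```

With this split, an L1-hit latency of 1 cycle no longer hides genuine L2 or directory miss latencies in the first histogram bucket.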
Re: [m5-dev] Review Request: Ruby: Add support for functional accesses
Hi Nilay, Let me start off by saying that I'm not sure if I fully understand the complexities of dealing with functional accesses from the PioPort, and I could be overlooking a key concern. I think implementing functional accesses for the PioPort should be very similar to cpu functional accesses. We still need to include the pio_port within the RubyPort, and we need to send all requests not directed at physical memory to it. You need to modify the connectX86RubySystem function in FSConfig.py so that all pio functional requests are seen by the ruby port versus physmem. The more difficult piece may be moving the address range registration functionality from physmem to RubyPort, since physmem will no longer exist. If you run into difficulties there, I encourage you to send email to the dev list, since others will be better resources than me. Is that the kind of information you were looking for? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, April 14, 2011 9:25 AM To: M5 Developer List Subject: Re: [m5-dev] Review Request: Ruby: Add support for functional accesses Brad, can you elaborate on implementing functional accesses for the PioPort? -- Nilay On Wed, 13 Apr 2011, Beckmann, Brad wrote: I just reviewed it. Please let me know if you have any questions. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, April 12, 2011 4:39 PM To: Default Subject: Re: [m5-dev] Review Request: Ruby: Add support for functional accesses Brad, can you take a look at the patch? I think we are now in a position to implement functional accesses for the PioPort. -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Ruby: Add support for functional accesses
I just reviewed it. Please let me know if you have any questions. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, April 12, 2011 4:39 PM To: Default Subject: Re: [m5-dev] Review Request: Ruby: Add support for functional accesses Brad, can you take a look at the patch? I think we are now in position to implement functional accesses for the PioPort. -- Nilay On Tue, 12 Apr 2011, Nilay Vaish wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/611/ --- (Updated 2011-04-12 16:35:34.866577) Review request for Default. Summary (updated) --- Ruby: Add support for functional accesses This patch is meant for aiding discussions on implementation of functional access support in Ruby. ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] AccessPermission in AbstractEntry
Hi Nilay, Yes, that is a good point. We really just need the interface to the permission to be available from AbstractEntry. The variable itself doesn't really need to be there. However, to make that change, you'll need to modify how CacheMemory supports atomics. Could you elaborate on your directory controller question. I suspect that you are right and that only one type of directory controller can exist in a system, but why is that a problem? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Sunday, April 10, 2011 2:12 AM To: m5-dev@m5sim.org Subject: [m5-dev] AccessPermission in AbstractEntry Brad, it seems like the m_Permission variable in AbstractEntry is not being used at all. In order to get AccessPermission for a state, the state_To_AccessPermission function needs to be called. Then, why have that variable? And this would mean that CacheMemory has no idea about the access permission, unless we expose the state to Cache Memory class. Also, as it now stands, it seems one cannot have two different types of directory controllers in a system. Is this correct? If yes, then why this restriction? -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] AccessPermission in AbstractEntry
-Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, April 11, 2011 2:38 PM To: M5 Developer List Subject: Re: [m5-dev] AccessPermission in AbstractEntry Could you elaborate on your directory controller question. I suspect that you are right and that only one type of directory controller can exist in a system, but why is that a problem? Is it not possible that we have a protocol in which different directory controllers may behave differently? Ok, I just had a chance to look at the code, and I think you are referring to the lack of a directory MACHINETYPE macro in RubySlicc_ComponentMapping.hh. Is that correct? Ideally, there shouldn't be a problem with adding any arbitrarily named controller to Ruby, as long as you incorporate the right component mapping functions into the protocol. However, in practice I suspect it will take some non-trivial amount of modifications to RubySlicc_ComponentMapping.hh. Also, you'll need to be careful about how Ruby and the generated SLICC code use the auto-generated MachineType functions. There may be some tricky issues there. Overall, I can't really provide you a lot of specifics on why the directory MACHINETYPE macro does not exist. There may have been some assumptions behind that that were relevant in GEMS but are no longer valid in gem5. I would grep through the Ruby and generated code for the MachineType functions to fully understand the ramifications. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Interpreting and Fixing Ruby Stats??
I realize the documentation is still under way for gem5, but I was wondering if there are any plans to document how users should be interpreting the Ruby stats file? (Particularly, the miss latency histograms.) Not all protocols support the miss latency histograms. Specifically, I believe only the hammer protocol supports all of them. Do you have any particular questions about them? Did people come to the conclusion that it is a good idea to have separate files for ruby stats vs. m5 stats (if so, sorry for the extra question)? I'm not sure we have a final decision on that, but Derek is the one currently looking into it. Additionally, is there an update on any plans to add descriptions and do stats accounting for the various cache memories? To my surprise, I always get this output for any CacheMemory in Ruby:

Cache Stats: system.l1_cntrl0.L1IcacheMemory
system.l1_cntrl0.L1IcacheMemory_total_misses: 0
system.l1_cntrl0.L1IcacheMemory_total_demand_misses: 0
system.l1_cntrl0.L1IcacheMemory_total_prefetches: 0
system.l1_cntrl0.L1IcacheMemory_total_sw_prefetches: 0
system.l1_cntrl0.L1IcacheMemory_total_hw_prefetches: 0

For now it looks like I'll have to derive some pseudo-information from the cache event and transition counts OR go in and try to hack in some of these stats myself. But ideally, I would say one aggregated stat file where I could grep out both cpu and detailed memory stats (i.e. what about mshr miss/hit counts?) would be awesome. If all this stuff is there, please excuse my ignorance, but if not, would someone be so kind as to provide a brief update of what's going on with this? The stats you mentioned are supported by some protocols but not others. Basically, those protocols that do support them include special actions that increment these stats. In my opinion, it is really hard to make these stats protocol agnostic, but you're welcome to propose a methodology that does. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Running Ruby w/32 Cores
Hi Korey, Yes, let's move this conversation back to m5-dev, since I think others may be interested and could help. I don't know what the problem is exactly, but at some point of time (probably back in the early GEMS days) I seem to remember the Set code included an assertion check about the 31st bit in 32-bit mode. Therefore, I think we knew about this problem and made sure that never happened. I believe that is why we used to have a restriction that Ruby could only support 16 processors. I'm really fuzzy on the details...maybe someone else can elaborate. In the end, I just want to make sure we add something in the code that makes sure we don't encounter this problem again. This is one of those bugs that can take a while to track down, if you don't catch it right when it happens with an assertion. Brad From: koreylsew...@gmail.com [mailto:koreylsew...@gmail.com] On Behalf Of Korey Sewell Sent: Tuesday, April 05, 2011 7:14 AM To: Beckmann, Brad Subject: Re: [m5-dev] Running Ruby w/32 Cores Hi again Brad, I looked this over again and although my 32-bit patch fixes things, now that I look at it again, I'm not convinced that I actually fixed the symptom of the bug but rather the cause of the bug. Do you happen to know what are the problems with the 32-bit Set counts? Sorry for prolonging the issue, but I thought I had put this to bed but maybe not. Finally, it may not matter that this works on 32-bit machines but it'd be nice if it did. 
(Let me know if I should move this convo to the m5-dev list) I end up checking the last bit in the count function manually (the code as follows):

int
Set::count() const
{
    int counter = 0;
    long mask;
    for (int i = 0; i < m_nArrayLen; i++) {
        mask = (long)0x01;
        for (int j = 0; j < LONG_BITS; j++) {
            // FIXME - significant performance loss when array
            // population < LONG_BITS
            if ((m_p_nArray[i] & mask) != 0) {
                counter++;
            }
            mask = mask << 1;
        }
#ifndef _LP64
        long msb_mask = 0x80000000;
        if ((m_p_nArray[i] & msb_mask) != 0) {
            counter++;
        }
#endif
    }
    return counter;
}

On Tue, Apr 5, 2011 at 1:30 AM, Korey Sewell ksew...@umich.edu wrote: Brad, it looks like you were right on the money here. I found the spot where it was returning the wrong value via a SLICC function to count sharers for everyone except the owner. I realized that the machine that I use for testing is just a 32-bit machine, and like you warned there look to be issues with the Set type there. I ran the Fft-32 cores on a 64-bit machine and it seems to work correctly. I'll be running the full splash/parsec suites soon and that should stress Ruby a good bit :). I have a patch that checks to see if _LP64 is defined, and if not, checks that last bit when doing the set count function. Thanks for being helpful in debugging. It was a relatively easy bug, but as always, going through code and becoming more proficient at getting around while trying to solve a bug is really helpful. On Fri, Apr 1, 2011 at 7:28 PM, Beckmann, Brad brad.beckm...@amd.com wrote: Ok, for the first trace, the critical line is the following:

348523 0 L2Cache L1_GETX ILOSXIFLXO [0x16180, line 0x16180] [NetDest (4) 0 - 1 1 1 1 1 1 1 1 1 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 - 0 0 - 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 - ] 30

The L2Cache identifies that 31 caches have a shared copy and that L1 cache 9 (L1-9) is the owner.
When L1Cache 0 (L1-0) issues a GETX, the L2Cache issues 30 Inv probes, forwards the GETX to L1-9, and sends an ack to L1-0 itself. However, the L2 cache tells L1-0 to expect only 30 acks instead of 31. It could be something wrong with the NetDest::count() function, or the Set::count() function. I slightly modified my previous patch to isolate what value the NetDest::count() function is returning. If it is returning 30 instead of 31, then it must be a problem with NetDest. You are compiling gem5 as a 64-bit binary, right? The second problem is essentially the same issue. L2Cache 31 (L2-31) is the owner of the block, but I suspect NetDest is not counting bit 31 and thus it is returning a count of 0...causing the error. Overall, concentrate on that NetDest::count function, or more importantly the Set::count() function. Once you find out the problem, please let me know. Thanks, Brad From: koreylsew...@gmail.com [mailto:koreylsew...@gmail.com] On Behalf Of Korey Sewell Sent: Friday, April 01, 2011 12:00 PM To: Beckmann, Brad Subject: Re: [m5-dev] Running Ruby w/32 Cores Brad, attached are the protocol traces grep'd for the offending addresses. I'm going to spend the weekend digging through Ruby code so hopefully I'm pretty close to generating the fixes myself
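The off-by-one Brad and Korey converge on, a 32-entry sharer bitmask whose count comes back one short on a 32-bit host, can be reproduced outside the simulator. The sketch below is a hypothetical reconstruction of the failure mode, not the actual Set code: a bit walk whose signed 32-bit mask goes negative once it is shifted into bit 31, so a `mask > 0` guard exits one bit early, plus the explicit MSB check in the spirit of Korey's `#ifndef _LP64` patch.

```python
LONG_BITS = 32

def to_int32(x):
    # Wrap a Python int to a signed 32-bit value, mimicking a 32-bit C long.
    x &= 0xFFFFFFFF
    return x - (1 << 32) if x & 0x80000000 else x

def count_broken(word):
    # Hypothetical buggy walk: after testing bit 30, the shifted mask
    # becomes negative on a 32-bit signed long, so the "mask > 0" guard
    # exits before bit 31 is ever tested.
    counter, mask = 0, 1
    while mask > 0:
        if word & mask:
            counter += 1
        mask = to_int32(mask << 1)
    return counter

def count_fixed(word):
    # Mirrors the spirit of the _LP64 patch: explicitly test the MSB.
    counter = count_broken(word)
    if word & 0x80000000:
        counter += 1
    return counter
```

With all 32 bits set, `count_broken` reports 31, exactly the scenario where a sharer in cache 31 disappears from the ack count.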
Re: [m5-dev] ruby_mem_tester.py
Thanks for pointing this out. The hammer protocol included an optimization for uniprocessor DMA that was probably just too aggressive to be worth the complexity. The optimization broke when I fixed another DMA bug in the protocol last week, but I failed to realize that since I often don't think about uniprocessor scenarios. Rather than try to revive the optimization, I'm just going to remove it. Patch is forthcoming. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Lisa Hsu Sent: Thursday, March 31, 2011 5:39 PM To: M5 Developer List Subject: [m5-dev] ruby_mem_tester.py Hi all, As I prepared to push a bunch of stuff today I found that the following command line fails at the head of the clean tree: ALPHA_SE_MOESI_hammer/m5.debug configs/example/ruby_mem_test.py -l 1000 --num-dma 2 I pushed my changes anyway because they didn't make any difference on this error, but I've never run ruby_mem_test before, haven't worked with DMA sequencers, and am not particularly cozy with MOESI_hammer, and was wondering if this was known or unknown, expected or unexpected. I presume unknown and unexpected. The error is an invalid transition from MI with event Writeback_Nack. It seems to occur anytime --num-dma > 1. Is this a big concern? Should I add this to flyspray? Lisa ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Ruby: Add support for functional accesses
Hi Nilay, Thanks for posting a new patch. I will review it as soon as I can...hopefully tonight. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Wednesday, March 30, 2011 4:32 PM To: Default Subject: Re: [m5-dev] Review Request: Ruby: Add support for functional accesses On Tue, 29 Mar 2011, Nilay Vaish wrote: Brad, I have posted on the review board my current implementation for supporting functional accesses in Ruby. This is untested and is mainly meant for furthering the discussions. I have some questions for you -- 1. How do we inform the other end of RubyPort's M5 Port about whether or not functional access was successful? 2. What's the role of directory memory in functional accesses? 3. If none of the caches have the block pertaining to the address of the access, then read accesses should be satisfied from the physical memory. Write accesses should always go to physical memory as well. How can physical memory be accessed from RubyPort? -- Nilay Brad, I have made some changes to the patch. I have updated it on the review board. I have added a call to sendFunctional() so as to send the response. I have also added call to sendFunctional() on the physical memory port of ruby port, so that the physical memory would also get updated. You had mentioned that we would unhook M5 memory and use Ruby to supply the data. How do we do this? And the second question from the previous mail still remains unanswered. Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
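The policy under discussion can be stated compactly: a functional read probes every cache for a valid copy and falls back to physical memory only if none holds the block, while a functional write must update every valid cached copy and memory. A toy sketch of that policy follows, with plain dicts standing in for cache controllers and DRAM; this is an illustration of the discussion, not gem5 code.

```python
def functional_read(addr, caches, phys_mem):
    # A read is satisfied by the first cache holding a valid copy;
    # only if no cache has the block does it fall through to memory.
    for cache in caches:
        if addr in cache:
            return cache[addr]
    return phys_mem[addr]

def functional_write(addr, value, caches, phys_mem):
    # A write must hit every valid cached copy *and* physical memory,
    # otherwise a later writeback would clobber the functional update.
    for cache in caches:
        if addr in cache:
            cache[addr] = value
    phys_mem[addr] = value
```

Note the caveat raised in the token-coherence thread: if the only up-to-date copy is in flight in the network, no controller holds it, and a scheme like this must fail the access rather than silently return stale data.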
Re: [m5-dev] Running Ruby w/32 Cores
Hi Korey, For the first trace, it looks like the L2 cache is either miscounting the number of valid L1 copies, or there is an error with the ack arithmetic. We are going to need a bit more information to figure out where the exact problem is. Could you apply the attached patch and reply with the new protocol trace? Thanks. For the second trace, you should be able to get the offending address by simply attaching GDB to the aborted process. Without knowing which address to zero in on, it is the proverbial needle in a haystack. Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Tuesday, March 29, 2011 10:15 AM To: M5 Developer List Subject: Re: [m5-dev] Running Ruby w/32 Cores Thanks for the response Brad. The 1st trace has 1 L2 and the 2nd has 1 L2 (I had a typo in the original email). For each trace, I attach the stdout/stderr (*.out) and then the protocol trace (*.prottrace). Also, in the 1st trace, the offending address is clear and I isolate that in the protocol trace file provided. However, in the 2nd trace, it's unclear (currently) which access caused it to fail so I took the whole protocol trace file and gzip'd it. My current lack of expertise in SLICC limits me a bit, but I'd like to be more helpful in debugging so if there is anything that I can look into (or run) on my end to expedite the process, please advise. In the interim, I'll try to locate the exact address that's breaking trace 2 and then hopefully repost that. Thanks! -Korey On Tue, Mar 29, 2011 at 12:02 PM, Beckmann, Brad brad.beckm...@amd.com wrote: Hi Korey, I believe both of these issues should be easy to solve once we have a protocol trace leading up to the error. If you could create such a trace and send it to the list, that would be great. Just zero in on the offending address.
Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Tuesday, March 29, 2011 8:11 AM To: M5 Developer List Subject: [m5-dev] Running Ruby w/32 Cores Hi All, I'm still having a bit of trouble running Ruby with 32+ cores. I am experimenting w/configs varying the l2-caches. The runs seem to generate various errors in SLICC. Has anybody seen these or have any insight into how to start solving these types of issues (posted below)? = The command line and errors are as follows: (1) 32 Cores and 32 L2s build/ALPHA_FS_MOESI_CMP_directory/m5.opt configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-l2caches=32 ... info: Entering event queue @ 0. Starting simulation... Runtime Error at MOESI_CMP_directory-dir.sm:155, Ruby Time: 38279: assert failure, PID: 5990 press return to continue. Program aborted at cycle 19139500 Aborted (2) 32 Cores and 1 L2 build/ALPHA_FS_MOESI_CMP_directory/m5.opt configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-l2caches=32 ... fatal: Invalid transition system.l1_cntrl0 time: 349075 addr: [0x16180, line 0x16180] event: Ack state: MM @ cycle 174537500 [doTransitionWorker:build/ALPHA_FS_MOESI_CMP_directory/mem/protocol/L1Cache_Transitions.cc, line 477] Memory Usage: 2316756 KBytes For more information see: http://www.m5sim.org/fatal/23f196b2 Please let me know if you do...Thanks! -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Running Ruby w/32 Cores
Hi Korey, I believe both of these issues should be easy to solve once we have a protocol trace leading up to the error. If you could create such a trace and send it to the list, that would be great. Just zero in on the offending address. Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Tuesday, March 29, 2011 8:11 AM To: M5 Developer List Subject: [m5-dev] Running Ruby w/32 Cores Hi All, I'm still having a bit of trouble running Ruby with 32+ cores. I am experimenting w/configs varying the l2-caches. The runs seem to generate various errors in SLICC. Has anybody seen these or have any insight into how to start solving these types of issues (posted below)? = The command line and errors are as follows: (1) 32 Cores and 32 L2s build/ALPHA_FS_MOESI_CMP_directory/m5.opt configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-l2caches=32 ... info: Entering event queue @ 0. Starting simulation... Runtime Error at MOESI_CMP_directory-dir.sm:155, Ruby Time: 38279: assert failure, PID: 5990 press return to continue. Program aborted at cycle 19139500 Aborted (2) 32 Cores and 1 L2 build/ALPHA_FS_MOESI_CMP_directory/m5.opt configs/example/ruby_fs.py -b FftBase32 -n 32 --num-dirs=32 --num-l2caches=32 ... fatal: Invalid transition system.l1_cntrl0 time: 349075 addr: [0x16180, line 0x16180] event: Ack state: MM @ cycle 174537500 [doTransitionWorker:build/ALPHA_FS_MOESI_CMP_directory/mem/protocol/L1Cache_Transitions.cc, line 477] Memory Usage: 2316756 KBytes For more information see: http://www.m5sim.org/fatal/23f196b2 Please let me know if you do...Thanks! -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: ruby: fixed cache index setting
Thanks! Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Saturday, March 26, 2011 1:54 PM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: fixed cache index setting I can do it... just wanted to make sure it was expected and not an actual bug. On Sat, Mar 26, 2011 at 1:46 PM, Beckmann, Brad brad.beckm...@amd.com wrote: Hi Steve, Oops. It was such a small change in configuration, I didn't think about rerunning the regression tester, but now thinking about it, yes it could impact the results. The cache indexing functions were not using the right bits before this change. I can go ahead and update the stats tonight. However, let me know if it is more convenient for you to update them yourself. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Saturday, March 26, 2011 6:20 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: fixed cache index setting Hi Brad, Would you expect this to change the results for the ruby regressions slightly? The regressions passed last night because the tests didn't actually get rerun (since scons doesn't see the config file as a dependency), but I'm seeing some failures in the tip on tests I'm running and I suspect it's due to this change. 
Steve On Fri, Mar 25, 2011 at 10:15 AM, Brad Beckmann brad.beckm...@amd.com wrote: changeset d8587c913ccf in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=d8587c913ccf description: ruby: fixed cache index setting

diffstat:

 configs/ruby/MESI_CMP_directory.py  | 17 +++--
 configs/ruby/MI_example.py          |  4 +++-
 configs/ruby/MOESI_CMP_directory.py | 17 +++--
 configs/ruby/MOESI_CMP_token.py     | 15 +--
 configs/ruby/MOESI_hammer.py        | 10 +++---
 5 files changed, 41 insertions(+), 22 deletions(-)

diffs (207 lines):

diff -r bbab80b639cb -r d8587c913ccf configs/ruby/MESI_CMP_directory.py
--- a/configs/ruby/MESI_CMP_directory.py Fri Mar 25 00:46:14 2011 -0400
+++ b/configs/ruby/MESI_CMP_directory.py Fri Mar 25 10:13:50 2011 -0700
@@ -68,15 +68,19 @@
     # Must create the individual controllers before the network to ensure the
     # controller constructors are called before the network constructor
     #
+    l2_bits = int(math.log(options.num_l2caches, 2))
+    block_size_bits = int(math.log(options.cacheline_size, 2))
     for i in xrange(options.num_cpus):
         #
         # First create the Ruby objects associated with this cpu
         #
         l1i_cache = L1Cache(size = options.l1i_size,
-                            assoc = options.l1i_assoc)
+                            assoc = options.l1i_assoc,
+                            start_index_bit = block_size_bits)
         l1d_cache = L1Cache(size = options.l1d_size,
-                            assoc = options.l1d_assoc)
+                            assoc = options.l1d_assoc,
+                            start_index_bit = block_size_bits)
         cpu_seq = RubySequencer(version = i, icache = l1i_cache,
@@ -91,9 +95,7 @@
                                       sequencer = cpu_seq,
                                       L1IcacheMemory = l1i_cache,
                                       L1DcacheMemory = l1d_cache,
-                                      l2_select_num_bits = \
-                                          math.log(options.num_l2caches,
-                                          2))
+                                      l2_select_num_bits = l2_bits)
         exec("system.l1_cntrl%d = l1_cntrl" % i)
@@ -103,12 +105,15 @@
         cpu_sequencers.append(cpu_seq)
         l1_cntrl_nodes.append(l1_cntrl)
+    l2_index_start = block_size_bits + l2_bits
+
     for i in xrange(options.num_l2caches):
         #
         # First create the Ruby objects associated with this cpu
         #
         l2_cache = L2Cache(size = options.l2_size,
-                           assoc = options.l2_assoc)
+                           assoc = options.l2_assoc,
+                           start_index_bit = l2_index_start)
         l2_cntrl = L2Cache_Controller(version = i, L2cacheMemory = l2_cache)

diff -r bbab80b639cb -r d8587c913ccf configs/ruby/MI_example.py
--- a/configs/ruby/MI_example.py Fri Mar 25 00:46:14 2011 -0400
+++ b/configs/ruby/MI_example.py Fri Mar 25 10:13:50 2011 -0700
@@ -60,6 +60,7 @@
     # Must create the individual controllers before the network to ensure the
     # controller constructors are called before the network constructor
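The arithmetic behind this changeset is small enough to state on its own: cache set indexing must start above the block-offset bits, and an L2 bank must additionally skip the bank-select bits. The sketch below is mine, not part of the patch; it computes the two `start_index_bit` values the diff introduces, using `math.log2` (exact for powers of two) where the config uses `int(math.log(x, 2))`.

```python
import math

def ruby_index_bits(cacheline_size, num_l2caches):
    # Low-order bits that address bytes within a cache line.
    block_size_bits = int(math.log2(cacheline_size))
    # Bits used to select which L2 bank a line maps to.
    l2_bits = int(math.log2(num_l2caches))
    # L1s index starting right above the block offset; L2 banks also
    # skip the bank-select bits, which are constant within one bank.
    return block_size_bits, block_size_bits + l2_bits
```

With 64-byte lines and 8 L2 banks, this yields start bits (6, 9): the L1 index begins at bit 6, while each L2 bank indexes from bit 9.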
Re: [m5-dev] Debugging Ruby Deadlocks...
Hi Korey, A few comments: - The difference in time is because the sequencer prints out the RubyCycle count for the issue time while the tick count is the global curTick value. Now that Ruby uses DPRINTFs, I think it makes sense to move all those ruby print outs to ticks. I actually have that on my long todo list, but I'm sure I won't get to it soon. If you want to go ahead and make the conversion, please do. - I'm pretty sure that Invalid range error is completely unrelated. Instead that sort of error is caused when you try to print out a MachineType variable, but the value has not been set. Typically that happens to network messages where the enqueue operations don't fill in all the fields. - Overall, when tracking down a deadlock issue, start with the protocol trace and track down what is happening with the particular address in question. From there, you can typically get an idea of what to zero in on. - By the way, have you had a chance to confirm that my patches from this weekend fixed your previous dma problem? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Wednesday, March 23, 2011 12:43 PM To: M5 Developer List Subject: Re: [m5-dev] Debugging Ruby Deadlocks... This problem may be related to my earlier post I sent today about Debugging Ruby Deadlocks, but when adding the RubyQueue traceflag for 32 cores you get this error: build/ALPHA_FS_MOESI_CMP_directory/m5.opt -d ruby_opt/ --trace-flags=RubyQueue configs/example/ruby_fs.py -b fft_64t_base -n 1 ... panic: Invalid range for type MachineType @ cycle 793500 [MachineType_to_string:build/ALPHA_FS_MOESI_CMP_directory/mem/protocol/MachineType.cc, line 42] Memory Usage: 2312860 KBytes For more information see: http://www.m5sim.org/panic/f419fb7 Program aborted at cycle 793500 Aborted I thought this was solved a while ago by a previous patch but it seems to be an issue again.
Is it something in SLICC that isn't being defined properly when the core count is large? If anybody has any thoughts, please let me know so we can patch it and push the changeset into the tree. Note: I don't get this problem when running just 1 core. -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
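Brad's diagnosis, a MachineType field never filled in at enqueue time, matches how a generated enum-to-string converter behaves: any value outside the enum's range trips the panic. A minimal illustration of that mechanism follows; the enum names and the function are placeholders, not the generated SLICC code.

```python
# Placeholder enum values standing in for the generated MachineType enum.
MACHINE_TYPES = ("L1Cache", "L2Cache", "Directory", "DMA")

def machine_type_to_string(value):
    # The generated converter rejects any out-of-range value -- which is
    # exactly what an uninitialized message field typically holds.
    if not (0 <= value < len(MACHINE_TYPES)):
        raise ValueError("Invalid range for type MachineType")
    return MACHINE_TYPES[value]
```

An enqueue action that forgets to set the field leaves garbage (or a sentinel) in it, and the first DPRINTF that tries to format the message hits the "Invalid range" panic, so the bug surfaces only when tracing is enabled.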
Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick
I sent Tushar an email this morning regarding this, hoping to catch him before he went to bed (he's currently in Singapore). Unfortunately he hasn't responded. Hopefully he'll get to this when he wakes up in a few hours. If he doesn't, I'll take a look at it tomorrow morning. I don't have time to do it today. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Tuesday, March 22, 2011 1:04 PM To: m5-dev@m5sim.org Subject: Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick You may already be taking care of this, but networktest.cc also had an error (ambiguous use of the pow function) that made all the regressions fail. That needs to be fixed quickly, regardless of what happens with the warnings or who originally worked on the code. Also, code that doesn't compile should really never have been committed in the first place. It couldn't have been tested since it couldn't have been run. Gabe On 03/22/11 15:46, Nilay Vaish wrote: On Tue, 22 Mar 2011, nathan binkert wrote: The warnings related to networktest.cc got added yesterday. Tushar should take care of the warnings related to networktest.cc. These I think have been around for quite a while. Either way, we should be eliminating warnings. I will commit a patch to eliminate the Sequencer related warnings. -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Ruby: Convert AccessModeType to RubyAccessMode
Hi Nilay, Why do you want to change the name? Both names seem equivalent to me. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Friday, March 18, 2011 9:55 PM To: Nilay Vaish; Default Subject: [m5-dev] Review Request: Ruby: Convert AccessModeType to RubyAccessMode --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/601/ --- Review request for Default. Summary --- Ruby: Convert AccessModeType to RubyAccessMode This patch converts AccessModeType to RubyAccessMode so that both the protocol dependent and independent code uses the same access mode. Diffs - src/cpu/testers/rubytest/Check.hh 9a6a02a235f1 src/cpu/testers/rubytest/Check.cc 9a6a02a235f1 src/mem/protocol/MESI_CMP_directory-msg.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_directory-msg.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_token-L1cache.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_token-dir.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_token-msg.sm 9a6a02a235f1 src/mem/protocol/RubySlicc_Exports.sm 9a6a02a235f1 src/mem/protocol/RubySlicc_Types.sm 9a6a02a235f1 src/mem/ruby/profiler/AccessTraceForAddress.hh 9a6a02a235f1 src/mem/ruby/profiler/AccessTraceForAddress.cc 9a6a02a235f1 src/mem/ruby/profiler/AddressProfiler.hh 9a6a02a235f1 src/mem/ruby/profiler/AddressProfiler.cc 9a6a02a235f1 src/mem/ruby/profiler/CacheProfiler.hh 9a6a02a235f1 src/mem/ruby/profiler/CacheProfiler.cc 9a6a02a235f1 src/mem/ruby/profiler/Profiler.hh 9a6a02a235f1 src/mem/ruby/slicc_interface/RubyRequest.hh 9a6a02a235f1 src/mem/ruby/system/CacheMemory.hh 9a6a02a235f1 src/mem/ruby/system/CacheMemory.cc 9a6a02a235f1 src/mem/ruby/system/Sequencer.hh 9a6a02a235f1 src/mem/ruby/system/Sequencer.cc 9a6a02a235f1 Diff: http://reviews.m5sim.org/r/601/diff Testing --- Thanks, Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Ruby: Convert AccessModeType to RubyAccessMode
Nevermind, I understand the reason now. This looks good to me. Thanks, Brad -Original Message- From: Beckmann, Brad Sent: Saturday, March 19, 2011 1:50 PM To: 'M5 Developer List' Subject: RE: [m5-dev] Review Request: Ruby: Convert AccessModeType to RubyAccessMode Hi Nilay, Why do you want to change the name? Both names seem equivalent to me. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Friday, March 18, 2011 9:55 PM To: Nilay Vaish; Default Subject: [m5-dev] Review Request: Ruby: Convert AccessModeType to RubyAccessMode --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/601/ --- Review request for Default. Summary --- Ruby: Convert AccessModeType to RubyAccessMode This patch converts AccessModeType to RubyAccessMode so that both the protocol dependent and independent code uses the same access mode. Diffs - src/cpu/testers/rubytest/Check.hh 9a6a02a235f1 src/cpu/testers/rubytest/Check.cc 9a6a02a235f1 src/mem/protocol/MESI_CMP_directory-msg.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_directory-msg.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_token-L1cache.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_token-dir.sm 9a6a02a235f1 src/mem/protocol/MOESI_CMP_token-msg.sm 9a6a02a235f1 src/mem/protocol/RubySlicc_Exports.sm 9a6a02a235f1 src/mem/protocol/RubySlicc_Types.sm 9a6a02a235f1 src/mem/ruby/profiler/AccessTraceForAddress.hh 9a6a02a235f1 src/mem/ruby/profiler/AccessTraceForAddress.cc 9a6a02a235f1 src/mem/ruby/profiler/AddressProfiler.hh 9a6a02a235f1 src/mem/ruby/profiler/AddressProfiler.cc 9a6a02a235f1 src/mem/ruby/profiler/CacheProfiler.hh 9a6a02a235f1 src/mem/ruby/profiler/CacheProfiler.cc 9a6a02a235f1 src/mem/ruby/profiler/Profiler.hh 9a6a02a235f1 src/mem/ruby/slicc_interface/RubyRequest.hh 9a6a02a235f1 src/mem/ruby/system/CacheMemory.hh 9a6a02a235f1 src/mem/ruby/system/CacheMemory.cc 9a6a02a235f1 src/mem/ruby/system/Sequencer.hh 9a6a02a235f1 
src/mem/ruby/system/Sequencer.cc 9a6a02a235f1 Diff: http://reviews.m5sim.org/r/601/diff Testing --- Thanks, Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Ruby FS - DMA Controller problem?
Korey, if your deadlock is with running the MOESI_CMP_directory protocol, I'm not surprised. DMA support is pretty much broken in that protocol. I have that fixed and I also fixed the underlying DMA problem. I'll be pushing the fixes momentarily. Korey and Malek, please pull these changes and confirm they fix your problem. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Friday, March 18, 2011 9:12 AM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? message below Why did it work before the block size patch? - When the ChunkGenerator sees the block size is 0, it doesn't split up the request into multiple packets and sends the whole dma request at once. That is fine because the DMASequencer splits the request into multiple requests and only responds to the dma port when the entire request is complete. With regards to the old changeset that boots with the block size = 0, I was not able to boot a large scale CMP system (more than 16 cores) due to the deadlock threshold being triggered. I'm assuming that Brad has a read on how to fix that problem so I'll probably start working on what is causing that deadlock so hopefully we can kind of pipeline the bug fixes. -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Ruby FS - DMA Controller problem?
Nevermind those. I had several incoming and outgoing emails from the weekend that finally got through our system. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Sent: Tuesday, March 22, 2011 8:07 PM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? On Sat, March 19, 2011 6:01 pm, Beckmann, Brad wrote: Korey, if you're deadlock is with running the MOESI_CMP_directory protocol, I'm not surprised. DMA support is pretty much broken in that protocol. I have that fixed and I also fixed the underlining DMA problem. I'll be pushing the fixes momentarily. Korey and Malek, please pull these changes and confirm they fix your problem. Brad Brad, how come the mails you sent on Saturday are being received now? -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Ruby FS - DMA Controller problem?
Hi Malek/Korey, The good news is that I've been able to dedicate a significant amount of time to this over the past day or so and I've got a good handle on what is going on here. Why did it work before the block size patch? - When the ChunkGenerator sees the block size is 0, it doesn't split up the request into multiple packets and sends the whole dma request at once. That is fine because the DMASequencer splits the request into multiple requests and only responds to the dma port when the entire request is complete. What is the current problem? - When the ChunkGenerator sees the block size of 64, the dma port splits the request into 64-byte packets, effectively doing the same thing the dma sequencer does. That in itself shouldn't break things...The DMA sequencer nacks all but the first 64-byte request of the dma transfer because it is designed to only handle one M5 packet at a time. Eventually the first 64-byte packet completes and the RubyPort tells the dma port to retry the second packet. The dma port does, but for some reason the DMASequencer still nacks that second request. I'm not quite sure why that is, but I'm sure I'll figure it out soon. Once I do, I'll push a fix along with all the other fixes I've come across along this multi-day adventure. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Thursday, March 17, 2011 3:10 PM To: Malek Musleh Cc: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? Hi Malek, Can you send your most recent trace showing what you described (if it isn't too big)? I haven't observed the different request size errors, but I think I have observed the different PRD addresses on the first access (in the most recent changeset). I'll double check. I was planning to post sometime soon what was the latest on my debugging efforts but a quick summary is that the PRD address gets set from a BMI.DTP register that eventually gets propagated through.
I haven't been able to verify if that is loaded from the kernel or some configuration parameter quite yet. I have a feeling it might also be linked with the timing simpleCpu changes about handling split requests, although Alpha does not support split requests, that is independent of the DMA transfers. Are you sure it's a split request problem and not an uncacheable address thing? Or maybe it's some combo of both? Also, comparing Ruby Traces (with and without failing changeset) the first PRD BaseAddr is consistent between them, but not consistent between Ruby/M5. So the fact that the PRD BaseAddr is 'wrong' in the one case does not prevent it from booting the Kernel. That's an interesting observation. It would be nice to figure out why that address may or may not matter though. Not really sure if that helps anymore. Malek On Tue, Mar 15, 2011 at 6:50 PM, Korey Sewell ksew...@umich.edu wrote: Sorry for the confusion, I definitely garbled up some terminology. I meant that the M5 ran with the atomic model to compare with the timing Ruby model. M5-atomic maybe runs in 10-15 mins and then Ruby 20-30 mins. I am able to get the problem point in the Ruby simulation (bad DMA access) in about 20 mins. I'm able to get to that same problem point in the M5-atomic mode in about 10 mins so as to see what to compare against and what values are being set/unset incorrectly. On Tue, Mar 15, 2011 at 6:22 PM, Beckmann, Brad brad.beckm...@amd.com wrote: I'm confused. Korey, I thought this DMA problem only existed with Ruby? If so, how were you able to reproduce it using atomic mode? Ruby does not work with the atomic cpu model. Please clarify, thanks! Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Tuesday, March 15, 2011 12:09 PM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem?
Hi Brad/Malek, I've been able to regenerate this error in about 20mins now (instead of hours) by running things in atomic mode. Not sure if that helps or not... On Tue, Mar 15, 2011 at 3:03 PM, Beckmann, Brad brad.beckm...@amd.comwrote: How is that you are able to run the memtester in FS Mode? I see the ruby_mem_tester.py in /configs/example/ but it seems that it is only configured for SE Mode as far as Ruby is concerned? I don't run it in FS mode. Since the DMA bug manifests only after hours of execution, I wanted to first verify that the DMA protocol support was solid using the mem tester. Somewhat surprisingly, I found several bugs in MOESI_CMP_directory's support of DMA. It turns out that the initial DMA support in that protocol wasn't very well
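The two behaviours Brad contrasts, block size 0 versus 64, come down to how a DMA request is chunked before it reaches the DMASequencer. The sketch below is a simplified illustration in the spirit of gem5's ChunkGenerator, not its actual code: a block size of 0 means no splitting, while a nonzero block size splits on aligned boundaries so each piece touches a single cache line.

```python
def split_request(start, size, chunk_size):
    # chunk_size == 0: no splitting -- the whole DMA request goes out
    # as one packet, which is what happened before the block-size fix.
    if chunk_size == 0:
        return [(start, size)]
    # Otherwise split on chunk_size-aligned boundaries, so every packet
    # stays within one cache-line-sized region.
    pieces, addr, end = [], start, start + size
    while addr < end:
        next_addr = min((addr // chunk_size + 1) * chunk_size, end)
        pieces.append((addr, next_addr - addr))
        addr = next_addr
    return pieces
```

An unaligned 256-byte transfer starting at 0x10 becomes five packets with 64-byte lines (a short head piece, three full lines, and a tail), so a sequencer that only handles one packet at a time must successfully retry every piece after the first.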
Re: [m5-dev] Ruby FS - DMA Controller problem?
How is that you are able to run the memtester in FS Mode? I see the ruby_mem_tester.py in /configs/example/ but it seems that it is only configured for SE Mode as far as Ruby is concerned? I don't run it in FS mode. Since the DMA bug manifests only after hours of execution, I wanted to first verify that the DMA protocol support was solid using the mem tester. Somewhat surprisingly, I found several bugs in MOESI_CMP_directory's support of DMA. It turns out that the initial DMA support in that protocol wasn't very well thought out. Now I fixed those bugs, but since the DMA problem also arises with the MOESI_hammer protocol, I'm confident that my patches don't fix the real problem. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Ruby FS - DMA Controller problem?
I'm confused. Korey, I thought this DMA problem only existed with Ruby? If so, how were you able to reproduce it using atomic mode? Ruby does not work with the atomic cpu model. Please clarify, thanks! Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Tuesday, March 15, 2011 12:09 PM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? Hi Brad/Malek, I've been able to regenerate this error in about 20mins now (instead of hours) by running things in atomic mode. Not sure if that helps or not... On Tue, Mar 15, 2011 at 3:03 PM, Beckmann, Brad brad.beckm...@amd.comwrote: How is that you are able to run the memtester in FS Mode? I see the ruby_mem_tester.py in /configs/example/ but it seems that it is only configured for SE Mode as far as Ruby is concerned? I don't run it in FS mode. Since the DMA bug manifests only after hours of execution, I wanted to first verify that the DMA protocol support was solid using the mem tester. Somewhat surprisingly, I found several bugs in MOESI_CMP_directory's support of DMA. It turns out that the initial DMA support in that protocol wasn't very well thought out. Now I fixed those bugs, but since the DMA problem also arises with the MOESI_hammer protocol, I'm confident that my patches don't fix the real problem. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Ruby FS - DMA Controller problem?
Thanks Malek. Very interesting. Yes, this 5 line changeset seems rather benign, but actually has huge ramifications. With this change, the RubyPort passes the correct block size to the cpu/device models. Without it, I believe the block size defaults to 0 or 1...I can't remember which. While that seems rather inconsequential, I noticed when I made this change that the memtester behaved quite differently. In particular, it keeps issuing requests until sendTiming returns false, instead of just one request/cpu at a time. Therefore another patch in this series added the retry mechanism to the RubyPort. I'm still not sure exactly what the problem is with ruby+dma, but I suspect that the dma devices are behaving differently now that the RubyPort passes the correct block size. I was able to spend a few hours on this over the weekend. I am now able to reproduce the error and I have a few protocol bug fixes queued up. However, I don't think those fixes actually solved the main issue. I don't think I'll be able to get to it today, but I'll try to find some time tomorrow to investigate further. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Monday, March 14, 2011 2:10 AM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? Which lines are you commenting out to get it to work? It's a bit unclear in the diff you point to (maybe because you said it's a full set of changes, not just one) (btw: The work I've been doing is comparing the old m5 memory trace to the gem5 memory trace to try to chase down the bug. I wouldn't be surprised if we are converging to the same bug though.) On Mon, Mar 14, 2011 at 3:51 AM, Malek Musleh malek.mus...@gmail.com wrote: Hi Brad, I found the problem that was causing this error. 
Specifically, it is this changeset:

changeset: 7909:eee578ed2130
user: Joel Hestness hestn...@cs.utexas.edu
date: Sun Feb 06 22:14:18 2011 -0800
summary: Ruby: Fix to return cache block size to CPU for split data transfers

Link: http://reviews.m5sim.org/r/393/diff/#index_header

Previously, I mentioned it was a couple of changesets prior to this one, but the changes between them are related, so it wasn't as obvious what was happening. In fact, this corresponds to the assert() for the block size you had put in to deal with x86 unaligned accesses, but then later removed because of LL/SC in Alpha. It's not clear to me why this is causing a problem, or rather why this doesn't return the default 64-byte block size from the ruby system, but commenting out those lines of code allowed it to work. Maybe Korey could confirm? Malek On Wed, Mar 9, 2011 at 8:24 PM, Beckmann, Brad brad.beckm...@amd.com wrote: I still have not been able to reproduce the problem, but I haven't tried in a few weeks. So does this happen when booting up the system, independent of what benchmark you are running? If so, could you send me your command line? I'm sure the disk image and kernel binaries between us are different, so I don't necessarily think I'll be able to reproduce your problem, but at least I'll be able to isolate it. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev- boun...@m5sim.org] On Behalf Of Malek Musleh Sent: Wednesday, March 09, 2011 4:41 PM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? Hi Korey, I ran into a similar problem with a different benchmark/boot up attempt. There is another thread on m5-dev with 'Ruby FS failing with recent changesets' as the subject. I was able to track down the changeset which it was coming from, but I did not look further into the changeset as to why it was causing it. Brad said he would take a look at it, but I am not sure if he was able to reproduce the problem.
Malek On Wed, Mar 9, 2011 at 7:08 PM, Korey Sewell ksew...@umich.edu wrote: Hi all, I'm trying to run Ruby in FS mode for the FFT benchmark. However, I've been unable to fully boot the kernel; it errors with a panic in the IDE disk controller:

panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1 @ cycle 62640732569001 [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc, line 323]

Has anybody run into a similar error or does anyone have any suggestions for debugging the problem? I can run the same code using the M5 memory system and FFT finishes properly so it's definitely a ruby-specific thing. To track this down, I could diff instruction traces (M5 v. Ruby) or maybe even diff trace output from the IdeDisk trace flags, but those routes seem a bit heavy-handed considering the amount of trace output generated. The command line this was run with is: build/ALPHA_FS_MOESI_CMP_directory
Re: [m5-dev] Ruby FS - DMA Controller problem?
Hi Malek, Just to reiterate, I don't think my patches will fix the underlying problem. Instead, my patches just fix various corner cases in the protocols. I suspect these corner cases are never actually reached in real execution. The fact that your dma traces point out that the Ruby and Classic configurations use different base addresses makes me think this might be a problem with configuration and device registration. We should investigate further. Brad -Original Message- From: Malek Musleh [mailto:malek.mus...@gmail.com] Sent: Monday, March 14, 2011 9:11 AM To: M5 Developer List Cc: Beckmann, Brad Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? Hi Korey/Brad, I commented out the following lines:

In RubyPort.hh:

    unsigned deviceBlockSize() const;

In RubyPort.cc:

    unsigned
    RubyPort::M5Port::deviceBlockSize() const
    {
        return (unsigned) RubySystem::getBlockSizeBytes();
    }

I also did a diff trace between M5 and Ruby using the IdeDisk traceflag as indicated earlier on. In the Ruby trace, it stalls at this:

2398589225000: system.disk0: Write to disk at offset: 0x1 data 0
239858940: system.disk0: Write to disk at offset: 0x2 data 0x10
2398589575000: system.disk0: Write to disk at offset: 0x3 data 0
2398589742000: system.disk0: Write to disk at offset: 0x4 data 0
2398589909000: system.disk0: Write to disk at offset: 0x5 data 0
2398590088000: system.disk0: Write to disk at offset: 0x6 data 0xe0
2398596763500: system.disk0: Write to disk at offset: 0x7 data 0xc8
2398597916500: system.disk0: PRD: baseAddr:0x87298000 (0x7298000) byteCount:8192 (16) eot:0x8000 sector:0
2398597916500: system.disk0: doDmaWrite, diskDelay: 100 totalDiskDelay: 116

Waiting for the Interrupt to be Posted.
However, a comparison between the M5 and Ruby traces suggests that they differ on the following line:

Ruby Trace:

239858940: system.disk0: Write to disk at offset: 0x2 data 0x10
2398589575000: system.disk0: Write to disk at offset: 0x3 data 0
2398589742000: system.disk0: Write to disk at offset: 0x4 data 0
2398589909000: system.disk0: Write to disk at offset: 0x5 data 0
2398590088000: system.disk0: Write to disk at offset: 0x6 data 0xe0
2398596763500: system.disk0: Write to disk at offset: 0x7 data 0xc8
2398597916500: system.disk0: PRD: baseAddr:0x87298000 (0x7298000) byteCount:8192 (16) eot:0x8000 sector:0
2398597916500: system.disk0: doDmaWrite, diskDelay: 100 totalDiskDelay: 116

M5 Trace:

2237623634000: system.disk0: Write to disk at offset: 0x7 data 0xc8
2237624206501: system.disk0: PRD: baseAddr:0x87392000 (0x7392000) byteCount:8192 (16) eot:0x8000 sector:0
2237624206501: system.disk0: doDmaWrite, diskDelay: 100 totalDiskDelay: 116

Note that the PRD baseAddr each tries to access is different; I would think they should be the same, and there is no reason why they should differ. The 0-or-1 block size and the sequential retries force the DMA timer to time out the request, which then fails with the inconsistent DMA state. I have attached both sets of traces in case it sheds any more light on the cause of the problem. In any case, it might not matter too much now since Brad was able to reproduce the problem and has a patch for it, but it may be of use for future M5 changes. Malek On Mon, Mar 14, 2011 at 11:54 AM, Beckmann, Brad brad.beckm...@amd.com wrote: Thanks Malek. Very interesting. Yes, this 5 line changeset seems rather benign, but actually has huge ramifications. With this change, the RubyPort passes the correct block size to the cpu/device models. Without it, I believe the block size defaults to 0 or 1...I can't remember which.
While that seems rather inconsequential, I noticed when I made this change that the memtester behaved quite differently. In particular, it keeps issuing requests until sendTiming returns false, instead of just one request/cpu at a time. Therefore another patch in this series added the retry mechanism to the RubyPort. I'm still not sure exactly what the problem is with ruby+dma, but I suspect that the dma devices are behaving differently now that the RubyPort passes the correct block size. I was able to spend a few hours on this over the weekend. I am now able to reproduce the error and I have a few protocol bug fixes queued up. However, I don't think those fixes actually solved the main issue. I don't think I'll be able to get to it today, but I'll try to find some time tomorrow to investigate further. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev- boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Monday, March 14, 2011 2:10 AM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? Which lines are you commenting out to get it to work? It's a bit unclear in the diff you point to (maybe because you said it's
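The retry handshake Brad describes above (the memtester keeps issuing requests until sendTiming returns false, then waits for the port to signal a retry before issuing again) can be sketched with a toy model. All names below are illustrative, not the gem5 API:

```python
# Toy model of the sendTiming/retry flow: the port nacks when full,
# and the tester resumes issuing only when the port signals a retry.
class Port:
    def __init__(self, capacity):
        self.capacity = capacity       # in-flight requests the port accepts
        self.in_flight = 0
        self.retry_pending = False

    def send_timing(self, req):
        # Returns False (a nack) when the port is busy; the sender must
        # then wait for a retry callback before issuing again.
        if self.in_flight >= self.capacity:
            self.retry_pending = True
            return False
        self.in_flight += 1
        return True


class Tester:
    def __init__(self, port, requests):
        self.port = port
        self.pending = list(requests)
        self.issued = []

    def issue(self):
        # Keep issuing until send_timing returns False -- the new
        # memtester behavior once the correct block size is reported.
        while self.pending and self.port.send_timing(self.pending[0]):
            self.issued.append(self.pending.pop(0))

    def recv_retry(self):
        # Port can accept requests again (draining is modeled externally
        # by decrementing port.in_flight before this is called).
        self.port.retry_pending = False
        self.issue()
```

With a capacity-2 port and three requests, issue() sends two, the third is nacked, and recv_retry() after the port drains sends the remainder.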
Re: [m5-dev] Functional Interface in Ruby
You probably already realize this, but I want to point out that the topology needs pointers to all the controllers. I don't have the code in front of me, but if I recall correctly, topology is then a member of the network. If you move the controllers underneath RubySystem and if RubySystem keeps its pointer to the network, then a cycle exists. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Saturday, March 12, 2011 9:32 PM To: M5 Developer List Subject: Re: [m5-dev] Functional Interface in Ruby On Sat, Mar 12, 2011 at 5:45 PM, Nilay Vaish ni...@cs.wisc.edu wrote: On Sat, 12 Mar 2011, Steve Reinhardt wrote: Can't we loop through the directory controllers in python to calculate the total size, then pass that size as a parameter to RubySystem? There's no reason for the C++ RubySystem object to need the directory controller pointers just to do that calculation. It is being done in Python script. We were thinking of passing RubySystem object to the Network. But RubySystem cannot be created before directory controllers are created. And the reason for these changes is to pass RubySystem object to the controllers. I'm still confused... the python objects can be created in any order, and parameter values can be set at any time and in any order, up until the instantiate() call. The acyclic dependency issue only affects the creation of C++ objects in instantiate(). So I don't see how this is relevant. I would like to access cache controllers from RubySystem parameter object in C++. If we do allow such access, then we would not have any cycle in the graph. We only need to create controllers, then network and then RubySystem in Python. If controllers are visible to RubySystem as members of the RubySystem parameter object, then we can create the list of cache memories by probing each controller object. 
Yea, I can see that even though that's not the m5 idiom, and is a little less convenient since the python code has to explicitly build this list instead of having it happen implicitly, that it fits better with the way RubySystem is currently built up. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Functional Interface in Ruby
In the short run, I think the easiest way to break the cycle is to have the network take the RubySystem object as a parameter instead of the other way around, then add a registerNetwork() callback on RubySystem to let the network give the system its pointer. ... Finally, it occurs to me that we avoid these issues to some extent in the classic m5 memory hierarchy by using ports rather than parameters to set up inter-object connections; maybe we should consider extending or adapting that model to Ruby someday. I think Steve's short-term solution is a good one. However, I'm not sure that Ruby always using ports would solve the problem. The connections that Nilay is trying to set up, a system-level list of all caches and memory objects, are not real connections; the list is purely bookkeeping. I'm not sure that ports really fit that model. Instead, it seems like the crux of the problem is that we want to set up this list in C++ because it doesn't make sense to explicitly set up these connections in the python file. I'm not sure if there is a perfect solution. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
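Steve's short-term fix above (the network takes the RubySystem as a constructor parameter, then hands its own pointer back through a registerNetwork() callback, so neither object needs the other to exist first) can be sketched with a toy model. The class and method names below are illustrative, not the actual gem5 code:

```python
# Toy model of breaking a parameter cycle with a registration callback.
# RubySystem is constructed first with no network; the Network constructor
# takes the system and immediately registers itself back, so the system
# still ends up with a network pointer without a cyclic dependency.
class RubySystem:
    def __init__(self):
        self.network = None            # filled in later via callback

    def register_network(self, network):
        self.network = network


class Network:
    def __init__(self, ruby_system):
        self.ruby_system = ruby_system
        ruby_system.register_network(self)   # give the system our pointer
```

Construction order is now acyclic: `rs = RubySystem()` then `net = Network(rs)`, after which `rs.network` points at `net` and `net.ruby_system` points back at `rs`.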
Re: [m5-dev] Review Request: Ruby: Get rid of the dead ruby tester.
Gabe, thanks for putting this patch out for review. I had forgotten that this directory still exists. I moved the code that I'm most familiar with out of this directory last year, but I didn't touch the Racey tester code because I wasn't sure what to do with it. I believe that code was written by Min Xu several years ago to test his flight data recorder. Subsequently we used to use it for general testing because it tended to find certain bugs much faster than the standard random tester. That being said, I suspect that code hasn't been used in 5+ years and at some point we need to have a timeout and just delete it. Unless the folks at Wisconsin prefer otherwise, I'm completely fine with deleting the whole directory. Regardless, the DeterministicDriver files should definitely be deleted. That functionality now exists in the directedtest directory. I should have deleted them in my changeset from last year. By the way, this reminds me that the directed test code is another piece that should be added to the regression tester. I'll add that to my list. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Thursday, March 10, 2011 11:10 AM To: Gabe Black Cc: Default; Ali Saidi Subject: Re: [m5-dev] Review Request: Ruby: Get rid of the dead ruby tester. I don't think it's dead, just sleeping... I'm not sure why it's not compilable right now (I thought it was usable), but I'd rather just fix that up than whack the code. We definitely need some input from Brad or the Wisconsin folks before making this change. Steve On Thu, Mar 10, 2011 at 11:03 AM, Gabe Black gbl...@eecs.umich.edu wrote: This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/555/ Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. By Gabe Black. Description Ruby: Get rid of the dead ruby tester. 
None of the code in the ruby tester directory is compiled or referred to outside of that directory. This change eliminates it. If it's needed in the future, it can be revived from the history. In the meantime, this removes clutter and the only use of the GEMS_ROOT scons variable.

Diffs
- src/mem/ruby/tester/DeterministicDriver.hh (77aa0f94e7f2)
- src/mem/ruby/tester/DeterministicDriver.cc (77aa0f94e7f2)
- src/mem/ruby/tester/RaceyDriver.hh (77aa0f94e7f2)
- src/mem/ruby/tester/RaceyDriver.cc (77aa0f94e7f2)
- src/mem/ruby/tester/RaceyPseudoThread.hh (77aa0f94e7f2)
- src/mem/ruby/tester/RaceyPseudoThread.cc (77aa0f94e7f2)
- src/mem/ruby/tester/SConscript (77aa0f94e7f2)
- src/mem/ruby/tester/SpecifiedGenerator.hh (77aa0f94e7f2)
- src/mem/ruby/tester/SpecifiedGenerator.cc (77aa0f94e7f2)
- src/mem/ruby/tester/Tester_Globals.hh (77aa0f94e7f2)
- src/mem/ruby/tester/main.hh (77aa0f94e7f2)
- src/mem/ruby/tester/main.cc (77aa0f94e7f2)
- src/mem/ruby/tester/test_framework.hh (77aa0f94e7f2)
- src/mem/ruby/tester/test_framework.cc (77aa0f94e7f2)

View Diff http://reviews.m5sim.org/r/555/diff/ ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Ruby FS - DMA Controller problem?
I still have not been able to reproduce the problem, but I haven't tried in a few weeks. So does this happen when booting up the system, independent of what benchmark you are running? If so, could you send me your command line? I'm sure the disk image and kernel binaries between us are different, so I don't necessarily think I'll be able to reproduce your problem, but at least I'll be able to isolate it. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Malek Musleh Sent: Wednesday, March 09, 2011 4:41 PM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS - DMA Controller problem? Hi Korey, I ran into a similar problem with a different benchmark/boot up attempt. There is another thread on m5-dev with 'Ruby FS failing with recent changesets' as the subject. I was able to track down the changeset which it was coming from, but I did not look further into the changeset as to why it was causing it. Brad said he would take a look at it, but I am not sure if he was able to reproduce the problem. Malek On Wed, Mar 9, 2011 at 7:08 PM, Korey Sewell ksew...@umich.edu wrote: Hi all, I'm trying to run Ruby in FS mode for the FFT benchmark. However, I've been unable to fully boot the kernel; it errors with a panic in the IDE disk controller:

panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1 @ cycle 62640732569001 [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc, line 323]

Has anybody run into a similar error or does anyone have any suggestions for debugging the problem? I can run the same code using the M5 memory system and FFT finishes properly so it's definitely a ruby-specific thing. To track this down, I could diff instruction traces (M5 v. Ruby) or maybe even diff trace output from the IdeDisk trace flags, but those routes seem a bit heavy-handed considering the amount of trace output generated.
The command line this was run with is:

build/ALPHA_FS_MOESI_CMP_directory/m5.opt configs/example/ruby_fs.py -b fft_64t_base -n 1

The output in system.terminal is:

hda: M5 IDE Disk, ATA DISK drive
hdb: M5 IDE Disk, ATA DISK drive
hda: UDMA/33 mode selected
hdb: UDMA/33 mode selected
hdc: M5 IDE Disk, ATA DISK drive
hdc: UDMA/33 mode selected
ide0 at 0x8410-0x8417,0x8422 on irq 31
ide1 at 0x8418-0x841f,0x8426 on irq 31
ide_generic: please use probe_mask=0x3f module parameter for probing all legacy ISA IDE ports
ide2 at 0x1f0-0x1f7,0x3f6 on irq 14
ide3 at 0x170-0x177,0x376 on irq 15
hda: max request size: 128KiB
hda: 2866752 sectors (1467 MB), CHS=2844/16/63
hda:4hda: dma_timer_expiry: dma status == 0x65
hda: DMA interrupt recovery
hda: lost interrupt
unknown partition table
hdb: max request size: 128KiB
hdb: 1008000 sectors (516 MB), CHS=1000/16/63
hdb:4hdb: dma_timer_expiry: dma status == 0x65
hdb: DMA interrupt recovery
hdb: lost interrupt

Thanks again, any help or thoughts would be well appreciated. -- - Korey ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Functional Interface in Ruby
I believe the L1DcacheMemory is created right after system because inside each protocol file the first thing attached to the system is the l1 controllers. That way the controllers get a more descriptive name than what they are as related to the topology. I'm still a little confused by the cycle error. If the parent.any call searches the graph for the closest object of that particular type, wouldn't you always get a cycle using parent.any? Or are other uses of parent.any more of an uncle search than a true parent search? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Wednesday, March 09, 2011 5:22 PM To: M5 Developer List Subject: Re: [m5-dev] Functional Interface in Ruby It seems odd that it tries to create L1DcacheMemory right after it creates system. Can you add print statements like in this patch and see what it shows?

diff --git a/src/python/m5/SimObject.py b/src/python/m5/SimObject.py
--- a/src/python/m5/SimObject.py
+++ b/src/python/m5/SimObject.py
@@ -843,8 +843,11 @@
     # Call C++ to create C++ object corresponding to this object
     def createCCObject(self):
+        print "Creating", self, "params"
         self.getCCParams()
+        print "Creating", self
         self.getCCObject() # force creation
+        print "Done creating", self
     def getValue(self):
         return self.getCCObject()

On Wed, Mar 9, 2011 at 2:34 PM, Nilay Vaish ni...@cs.wisc.edu wrote:

Creating root
Creating system.physmem
Creating system
Creating system.l1_cntrl0.L1DcacheMemory
Creating system.ruby
Creating system.ruby.network
Creating system.ruby.network.topology
Creating system.ruby.network.topology.ext_links0
Creating system.l1_cntrl0
Creating system.l1_cntrl0.L1DcacheMemory

This is the output I obtained from SimObject.py; clearly there is a cycle. Should not the cache controllers be part of ruby, instead of being part of system? Once they become part of ruby, it should be possible to traverse the controller array and figure out all the caches.
Nilay On Wed, 9 Mar 2011, Steve Reinhardt wrote: I think you're looking in the wrong place... you want to look at getCCObject() in src/python/m5/SimObject.py where the error message is coming from, and see if you can add some print statements there. Steve On Wed, Mar 9, 2011 at 11:27 AM, Nilay Vaish ni...@cs.wisc.edu wrote: What exactly happens on the function call Param.RubySystem(Parent.any, "Ruby System")? Nilay On Wed, 9 Mar 2011, Steve Reinhardt wrote: Does the RubySystem object have a pointer to a RubyCache object? You could also go into the python code and add some print statements to get a clue about where the cycle is occurring. Steve On Wed, Mar 9, 2011 at 4:51 AM, Nilay ni...@cs.wisc.edu wrote: Brad, given current versions of MESI_CMP_directory.py and Ruby.py, the following change to the way cache memory is added to the system creates a loop. What am I missing here?

class RubyAbstractMemory(SimObject):
    type = 'RubyAbstractMemory'
    cxx_class = 'AbstractMemory'
    system = Param.RubySystem(Parent.any, "Ruby System")

class RubyCache(RubyAbstractMemory):
    type = 'RubyCache'
    cxx_class = 'CacheMemory'
    size = Param.MemorySize("capacity in bytes")
    latency = Param.Int()
    assoc = Param.Int()
    replacement_policy = Param.String("PSEUDO_LRU", "")
    start_index_bit = Param.Int(6, "index start, default 6 for 64-byte line")

-- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Functional Interface in Ruby
Great. It sounds like we are thinking of a similar solution. Just one thing I want to point out is that AbstractController may not be the right place to build the list. As you know, sometimes a controller may manage multiple cachememory objects and other controllers may not manage any cachememory or directorymemory objects. Instead, you may want to consider creating a separate RubyStorage class that builds the list, from which both CacheMemory and DirectoryMemory inherit. I'll leave it up to you to decide which is easier. Also, we don't want to further inhibit ourselves from creating multiple Ruby systems in the same simulation. (I understand there may be other issues that currently prevent us from doing that.) Therefore, instead of using a static function, we can build the list on a per-RubySystem basis. The cachememory and directorymemory objects should be able to get a pointer to their associated RubySystem using the Parent.any directive in their .py file. See the following line in sim/System.py for an example: 'physmem = Param.PhysicalMemory(Parent.any, "physical memory")'. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Tuesday, March 08, 2011 3:22 AM To: Beckmann, Brad Cc: m5-dev@m5sim.org Subject: RE: Functional Interface in Ruby It seems that this will work out. We can make AbstractController call a static function of the RubyPort class that will add the calling object to some list which will be accessed while making functional accesses. As far as pushing functional access support into the Sequencer is concerned, there was no particular reason for that. Since the Sequencer handles the timing accesses, I thought that should be the file that would contain the code for functional accesses. I am fine with functional access code going into RubyPort. -- Nilay On Mon, 7 Mar 2011, Beckmann, Brad wrote: Hi Nilay, Please excuse the slow response. I've been meaning to reply to this email for a few days.
Absolutely, we will need to maintain some sort of list of all cachememory and directorymemory objects to make the functional access support work. However, I'm not sure if we'll need to modify the protocol python files. Instead, could we create a list of these objects through their c++ constructors similar to how the SimObject list is created? Also, I know the line between the RubyPort and Sequencer is quite blurry, but is there a particular reason to push the functional access support into the Sequencer? It seems that the RubyPort would be a more natural location. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Friday, March 04, 2011 9:49 AM To: Beckmann, Brad Cc: m5-dev@m5sim.org Subject: Functional Interface in Ruby I have been thinking about how to make Ruby support functional accesses. It seems that somewhere we will have to add support so that either the RubyPort or the Sequencer can view all other caches. I am currently leaning towards adding it to the sequencer. I think this can be done by editing protocol files in configs/ruby. And then RubyPort can pass on functional accesses to the Sequencer, which will look up all the caches and take the correct action. I think this can be made to work. Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Functional Interface in Ruby
Hi Nilay, It looks like my email filter of the m5-dev list caused me to basically send you the same suggestion that Steve sent you. Sorry for the confusion, but it is good to know that Steve and I at least are considering the same problem. From now on, let's drop our individual email addresses and just direct our responses to m5-dev. Brad From: Steve Reinhardt [mailto:ste...@gmail.com] Sent: Tuesday, March 08, 2011 7:18 AM To: M5 Developer List Cc: Nilay Vaish; Beckmann, Brad Subject: Re: [m5-dev] Functional Interface in Ruby Forgot to mention that this is how we handle registering all the thread contexts within a system... you can look at that code (in the CPU models and in System) for an example. On Tue, Mar 8, 2011 at 7:16 AM, Steve Reinhardt ste...@gmail.com wrote: Sorry I missed this thread... I just read Nilay's response about python issues and he pointed me over here. One thing we should think about is that we really only want the caches within a single system to be flushed at once... I know that it's unlikely that anyone will want to model two systems with detailed memory models at once, and I vaguely recall there were other issues with Ruby not really supporting multiple instances of itself, but I don't want to see us make things less modular than they already are. The m5 idiom for doing this is: - add a parameter to each cache/controller/whatever we want to track, like this: system = Param.System(Parent.any, "system object") - add a method to the System object like registerCache(Cache *c) that adds c to the system object's list of caches - Have each cache constructor call p->system->registerCache(this) to register itself Would something like this work for what you're trying to do? Steve On Tue, Mar 8, 2011 at 3:21 AM, Nilay Vaish ni...@cs.wisc.edu wrote: It seems that this will work out.
We can make AbstractController call a static function of the RubyPort class that will add the calling object to some list which will be accessed while making functional accesses. As far as pushing functional access support into the Sequencer is concerned, there was no particular reason for that. Since the Sequencer handles the timing accesses, I thought that should be the file that would contain the code for functional accesses. I am fine with functional access code going into RubyPort. -- Nilay On Mon, 7 Mar 2011, Beckmann, Brad wrote: Hi Nilay, Please excuse the slow response. I've been meaning to reply to this email for a few days. Absolutely, we will need to maintain some sort of list of all cachememory and directorymemory objects to make the functional access support work. However, I'm not sure if we'll need to modify the protocol python files. Instead, could we create a list of these objects through their c++ constructors similar to how the SimObject list is created? Also, I know the line between the RubyPort and Sequencer is quite blurry, but is there a particular reason to push the functional access support into the Sequencer? It seems that the RubyPort would be a more natural location. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Friday, March 04, 2011 9:49 AM To: Beckmann, Brad Cc: m5-dev@m5sim.org Subject: Functional Interface in Ruby I have been thinking about how to make Ruby support functional accesses. It seems that somewhere we will have to add support so that either the RubyPort or the Sequencer can view all other caches. I am currently leaning towards adding it to the sequencer. I think this can be done by editing protocol files in configs/ruby. And then RubyPort can pass on functional accesses to the Sequencer, which will look up all the caches and take the correct action. I think this can be made to work.
Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
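The registration idiom discussed in this thread (each cache resolves its enclosing system, as Parent.any would in gem5's python layer, and registers itself from its constructor, the way a gem5 C++ cache would call p->system->registerCache(this)) can be modeled in a few lines of plain Python. The class and method names here are illustrative, not the real gem5 API:

```python
# Toy model of per-system self-registration: each Cache is constructed
# with a reference to its System and adds itself to that system's list,
# so the list of caches is built implicitly as objects are created.
class System:
    def __init__(self, name):
        self.name = name
        self.caches = []        # populated by Cache constructors

    def register_cache(self, cache):
        self.caches.append(cache)


class Cache:
    def __init__(self, name, system):
        self.name = name
        self.system = system
        system.register_cache(self)   # self-registration at construction
```

Because the list lives on each System rather than in a static/global, two simulated systems would each track (and flush) only their own caches, which addresses Steve's modularity concern above.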
Re: [m5-dev] Functional Interface in Ruby
Hi Nilay, Please excuse the slow response. I've been meaning to reply to this email for a few days. Absolutely, we will need to maintain some sort of list of all cachememory and directorymemory objects to make the functional access support work. However, I'm not sure if we'll need to modify the protocol python files. Instead, could we create a list of these objects through their c++ constructors similar to how the SimObject list is created? Also, I know the line between the RubyPort and Sequencer is quite blurry, but is there a particular reason to push the functional access support into the Sequencer? It seems that the RubyPort would be a more natural location. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Friday, March 04, 2011 9:49 AM To: Beckmann, Brad Cc: m5-dev@m5sim.org Subject: Functional Interface in Ruby I have been thinking about how to make Ruby support functional accesses. It seems that somewhere we will have to add support so that either the RubyPort or the Sequencer can view all other caches. I am currently leaning towards adding it to the sequencer. I think this can be done by editing protocol files in configs/ruby. And then RubyPort can pass on functional accesses to the Sequencer, which will look up all the caches and take the correct action. I think this can be made to work. Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Testing Functional Access
Hi Nilay, I would suggest a few different tests. The first one would be to run a simple binary under Alpha SE mode using Ruby. You should first observe a bunch of functional accesses that initialize memory, and then (if I recall correctly) dynamic accesses will load the TLB. After passing that test, I would try loading an SE checkpoint and running. After that, I would move on to similar tests using FS mode. I hope that helps. Please let me know if you have any specific questions. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Sent: Tuesday, March 01, 2011 6:51 AM To: m5-dev@m5sim.org Subject: [m5-dev] Testing Functional Access How can I test whether or not functional accesses to the memory are working correctly? Do we have some regression test for this? Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Testing Functional Access
I forgot that the memtester includes functional accesses. That is a good suggestion, especially when it comes to testing the situations where Ruby can't satisfy the functional access due to contention with timing accesses. The memtester does run with Ruby (it actually runs every night in the regression tester); however, the percentage of functional accesses is currently set to zero. See configs/example/ruby_mem_test.py. You'll obviously want to change that and include code within src/cpu/testers/memtest/* to handle failed functional accesses. If you don't want to initially deal with the failure situations, you can set the functional access percentage to 100% and that should always work. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Tuesday, March 01, 2011 10:49 AM To: M5 Developer List Subject: Re: [m5-dev] Testing Functional Access The m5 memtester supports functional accesses (there's a percent_functional parameter on the MemTest object). I don't know if anyone's run the memtester with Ruby though. Seems like it should work. Steve On Tue, Mar 1, 2011 at 8:39 AM, Joel Hestness hestn...@cs.utexas.edu wrote: Hi Nilay, I don't know if there is a regression for it, but the M5 utility (./util/m5/) sets up functional accesses to memory. For instance, in FS, if you specify an rcS script to fs.py and call % /sbin/m5 readfile from the command line of the simulated system, it will read the specified rcS file off the host machine's disk and send it to the memory of the simulated system using functional accesses. I think there are other functional access examples in the magic that the M5 utility provides. Hope this helps, Joel On Tue, Mar 1, 2011 at 8:51 AM, Nilay ni...@cs.wisc.edu wrote: How can I test whether or not functional accesses to the memory are working correctly? Do we have some regression test for this?
Thanks Nilay -- Joel Hestness PhD Student, Computer Architecture Dept. of Computer Science, University of Texas - Austin http://www.cs.utexas.edu/~hestness ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
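The failure handling Brad describes — a functional access that loses a race with in-flight timing accesses must be retried, not treated as an error — can be mocked up as follows. `ToyTester` and its methods are invented for illustration and bear no relation to the real MemTest source:

```python
import random

# Toy model of the tester behavior described above (invented names, not
# the real MemTest code): a configurable fraction of requests are
# functional, and a functional access that races with a conflicting
# timing access fails and must be retried rather than reported as an
# error.
class ToyTester:
    def __init__(self, percent_functional, seed=0):
        self.percent_functional = percent_functional
        self.rng = random.Random(seed)
        self.retries = 0

    def _functional_access(self, busy):
        if busy:              # a conflicting timing access is in flight
            self.retries += 1
            return False      # caller must retry later
        return True

    def issue(self, busy=False):
        if self.rng.randrange(100) < self.percent_functional:
            return "functional" if self._functional_access(busy) else "retry"
        return "timing"
```

At `percent_functional=0` (the current regression setting) the tester never exercises the functional path; at 100 every request is functional, matching Brad's note that the all-functional configuration should always succeed when nothing is contending.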
Re: [m5-dev] Review Request: Ruby: Fix DPRINTF bugs in PerfectSwitch and MessageBuffer
Hi Nilay, In the future, feel free to directly check in these sorts of minor bug fixes. Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, March 01, 2011 1:32 PM To: Nilay Vaish; Default Subject: [m5-dev] Review Request: Ruby: Fix DPRINTF bugs in PerfectSwitch and MessageBuffer --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/505/ --- Review request for Default. Summary --- At a couple of places in PerfectSwitch.cc and MessageBuffer.cc, DPRINTF() has not been provided with the correct number of arguments. The patch fixes these bugs. Diffs - src/mem/ruby/buffers/MessageBuffer.cc UNKNOWN src/mem/ruby/network/simple/PerfectSwitch.cc UNKNOWN Diff: http://reviews.m5sim.org/r/505/diff Testing --- Thanks, Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Functional Access support in Ruby
Hi Nilay, What exactly are you referring to as the underlying processor? Are you referring to real silicon? Actual hardware doesn't support functional accesses. Functional accesses are unique to gem5 and are completely fake when compared to actual hardware. gem5 could support functional accesses by quiescing the system and then performing the read or write using the existing timing path. That would probably be a suitable solution if gdb running on the simulated system were the only source of dynamic functional accesses. However, there are other sources of dynamic functional accesses, and we don't want to always perturb the system when performing those accesses. Thus we need a backdoor that doesn't perturb the system. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Saturday, February 26, 2011 9:06 AM To: M5 Developer List Subject: Re: [m5-dev] Functional Access support in Ruby I was thinking about the behavior of functional accesses. Currently in gdb we can change the value of a program variable. Does that mean the underlying processor supports functional accesses? If yes, then we should already have some knowledge about what is expected from functional accesses. Nilay On Fri, 25 Feb 2011, Beckmann, Brad wrote: Yes, that is correct. The RubyPort::M5Port::recvFunctional() function is where we need to add the new support. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Friday, February 25, 2011 12:20 PM To: m5-dev@m5sim.org Subject: [m5-dev] Functional Access support in Ruby Brad, Here is my understanding of the current state of functional accesses in gem5. As of now, all functional accesses are forwarded to the PhysicalMemory's MemoryPort. Instead, we would like to add a recvFunctional() function to the M5Port of the RubyPort, and attach this port as the peer instead of the PhysicalMemory.
-- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Functional Access support in Ruby
Yes, that is correct. The RubyPort::M5Port::recvFunctional() function is where we need to add the new support. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Friday, February 25, 2011 12:20 PM To: m5-dev@m5sim.org Subject: [m5-dev] Functional Access support in Ruby Brad, Here is my understanding of the current state of functional accesses in gem5. As of now, all functional accesses are forwarded to the PhysicalMemory's MemoryPort. Instead, we would like to add a recvFunctional() function to the M5Port of the RubyPort, and attach this port as the peer instead of the PhysicalMemory. -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
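The re-plumbing Nilay describes amounts to changing which object sits on the memory side of the CPU's functional port. A minimal mock of that wiring (all class and method names are illustrative, not the M5 port API):

```python
# Minimal mock of the plumbing described above: the CPU port delivers
# functional packets to whatever peer it is wired to. Re-pointing the
# peer from physical memory to the RubyPort lets Ruby search its caches
# first. All names are illustrative, not the M5 API.
class Port:
    def __init__(self):
        self.peer = None

    def send_functional(self, addr):
        return self.peer.recv_functional(addr)


class PhysicalMemoryPort(Port):
    def __init__(self, mem):
        super().__init__()
        self.mem = mem  # addr -> value backing store

    def recv_functional(self, addr):
        return self.mem[addr]


class RubyM5Port(Port):
    def __init__(self, caches, mem):
        super().__init__()
        self.caches = caches            # list of {addr: value} dicts
        self.backing = PhysicalMemoryPort(mem)

    def recv_functional(self, addr):
        # Search the cache hierarchy for the freshest copy first,
        # then fall back to the backing memory image.
        for cache in self.caches:
            if addr in cache:
                return cache[addr]
        return self.backing.recv_functional(addr)
```

The CPU side is untouched: it still calls `send_functional` on its port; only the peer changes from the physical-memory port to the Ruby one.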
Re: [m5-dev] Store Buffer
It sounds like we are in agreement here, but I just want to make sure we clarify one item. I don't believe simply checking the coherence permissions at commit time can sufficiently support stronger consistency models like SC/TSO. Instead, you really need to know whether you've ever lost the block since the speculative instruction read it. Therefore, Ruby really does need to forward invalidations to the CPU. It sounded like from your responses that you understand that as well, but I just wanted to make the point clear. Brad From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Friday, February 25, 2011 10:29 AM To: M5 Developer List Subject: Re: [m5-dev] Store Buffer This sounds right. Ruby does need to forward invalidations to the CPU since some models (including O3) will need to do internal invalidations/flushes to maintain consistency. Others can choose to do it other ways (e.g., by querying the L1 at commit as you suggest), but they have the option of ignoring the forwarded invalidations, so that's not a problem. Steve On Fri, Feb 25, 2011 at 9:07 AM, Arkaprava Basu aba...@wisc.edu wrote: In sum, I think we all agree that Ruby is going to handle *only non-speculative stores*. M5 CPU model(s) handle all of the speculative and non-speculative stores that are *yet to be revealed to the memory sub-system*. To make it clearer, as I understand, we now have the following: 1. All store buffering (speculative and non-speculative) is handled by the CPU model in M5. 2. Ruby needs to forward any intervention/invalidation received at the L1 cache controller to the CPU model to let it take appropriate action to guarantee the required memory consistency (e.g., it may need to flush the pipeline). OR CPU models need to check coherence permissions at the L1 cache at commit time to know whether intervening writes have happened or not (might be required to implement a stricter model like SC).
I think we need to provide one of these two mechanisms from the Ruby side to allow the second condition above. Which one to provide depends upon what the M5 CPU models want to do to guarantee consistency. Please let me know if you disagree or if I am missing something. Thanks Arka On 02/24/2011 05:22 PM, Beckmann, Brad wrote: So I think Steve and I are in agreement here. We both agree that both speculative and non-speculative store buffers should be on the CPU side of the RubyPort interface. I believe that was the same line that existed when Ruby was tied to Opal in GEMS. I believe the non-speculative store buffer was only a feature used when Opal was not attached, and it was just the simple SimicsProcessor driving Ruby. The sequencer is a separate issue. Certain functionality of the sequencer can probably be eliminated in gem5, but I think other functionality needs to remain or at least be moved to some other part of Ruby. The sequencer performs a lot of protocol-independent functionality, including: updating the actual data block, performing synchronization with respect to the cache memory, translating M5 packets to Ruby requests, checking for per-cache-block deadlock, and coalescing requests to the same cache block. The coalescing functionality can probably be eliminated, but I think the other functionality needs to remain. Brad From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Thursday, February 24, 2011 1:52 PM To: M5 Developer List Subject: Re: [m5-dev] Store Buffer On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish ni...@cs.wisc.edu wrote: On Thu, 24 Feb 2011, Beckmann, Brad wrote: Steve, I think we are in agreement here and we may just be disagreeing about the definition of speculative.
From the Ruby perspective, I don't think it really matters...I don't think there is a difference between a speculative store address request and a prefetch-with-write-intent. Also, we agree that probes will need to be sent to the O3 LSQ to support the consistency model. My point is that if we believe this functionality is required, what is the extra overhead of adding a non-speculative store buffer to the O3 model as well? I think that will be easier than trying to incorporate the current Ruby non-speculative store buffer into each protocol. I don't know the O3 LSQ model very well, but I assume it buffers both speculative and non-speculative stores. Are there two different structures in Ruby for that? I think the general issue here is that the dividing line between processor and memory system is different in M5 than it was with GEMS, with M5 assuming that write buffers, redundant request filtering, etc. all happen in the processor. For example, I know I've had you explain this to me multiple times already, but I still don't understand
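The SC/TSO point Brad makes — the core must know whether it ever lost the block between a speculative load and its commit — can be illustrated with a toy LSQ that squashes loads when a forwarded invalidation hits their line. All names here are invented for illustration, not the O3 LSQ source:

```python
# Toy model of the consistency mechanism discussed above: Ruby forwards
# invalidations to the core, and any speculative load whose cache line
# was invalidated before commit must be squashed and replayed.
class SpeculativeLSQ:
    def __init__(self):
        self.inflight = {}  # load sequence number -> cache line address

    def speculative_load(self, seq, line_addr):
        """Record a load that executed speculatively on this line."""
        self.inflight[seq] = line_addr

    def recv_invalidate(self, line_addr):
        """Invalidation forwarded from the L1 controller: squash every
        speculative load that read the lost line."""
        squashed = [s for s, a in self.inflight.items() if a == line_addr]
        for s in squashed:
            del self.inflight[s]
        return squashed

    def commit(self, seq):
        """A load commits safely only if its line was never lost; a
        squashed load is gone from the in-flight set and must replay."""
        return self.inflight.pop(seq, None) is not None
```

This is exactly why a commit-time permission check alone is insufficient: the block could have been invalidated and then re-fetched, leaving the permissions valid again even though an intervening write occurred.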
Re: [m5-dev] Store Buffer
So we probably don't want to pass speculative store data to the RubyPort, but what about speculative load and store requests? I suspect we do want to send them to the RubyPort before the speculation is confirmed. That might require splitting stores into two separate transactions: the request and the actual data write. Also, I suspect that the RubyPort will need to forward probes to the CPU models to allow the LSQ to maintain the proper consistency model. If those two things end up being true, then what is the benefit of putting the non-speculative store buffer in each protocol, versus just in the O3 CPU model? I'm not yet ready to advocate that is the right solution. I just want us to think these issues thru before deciding to go down one path or the other. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, February 24, 2011 10:45 AM To: M5 Developer List Subject: Re: [m5-dev] Store Buffer On Thu, 24 Feb 2011, Arkaprava Basu wrote: Fundamentally, I wish to handle only non-speculative memory state within Ruby. Otherwise I think there might be a risk of Ruby getting affected by the CPU model's behavior/nuances. As you suggested, the RubyPort may well be the line dividing speculative and non-speculative state. I also agree that beyond the RubyPort, all the stores should be non-speculative. I haven't looked at the store buffer code in libruby and do not know how it interfaces with the protocols. So sorry, I don't have specific answers to your questions. I think Derek is the best person to comment on this, as I believe he has used the store buffer implementation for his prior research. I think currently the store buffer is not being used at all. I looked through the GEMS code, and some of the protocols do declare a store buffer, but no one makes use of it. In gem5, store buffers are not included in the protocol files. In fact, the current libruby code does nothing useful at all.
I do think, though, that the highest-level (closest to the processor) cache controller (i.e. *-L1Cache.sm) needs to be made aware of the store buffer (unless it is hacked to bypass SLICC). Thanks Arka -- Nilay On 02/23/2011 11:29 PM, Beckmann, Brad wrote: Sorry, I should have been more clear. It fundamentally comes down to how the Ruby interface helps support memory consistency, especially considering more realistic buffering between the CPU and memory system (both speculative and non-speculative). I'm pretty certain that Ruby and the RubyPort interface will need to be changed. I just want us to fully understand the issues before making any changes or removing certain options. So are you advocating that the RubyPort interface be the line between speculative memory state and non-speculative memory state? As far as the current Ruby store buffer goes, how does it work with the L1 cache controller? For instance, if the L1 cache receives a probe/forwarded request to a block that exists in the non-speculative store buffer, what is the mechanism to retrieve the up-to-date data from the buffer entry? Is the mechanism protocol agnostic? Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Store Buffer
Steve, I think we are in agreement here and we may just be disagreeing about the definition of speculative. From the Ruby perspective, I don't think it really matters...I don't think there is a difference between a speculative store address request and a prefetch-with-write-intent. Also, we agree that probes will need to be sent to the O3 LSQ to support the consistency model. My point is that if we believe this functionality is required, what is the extra overhead of adding a non-speculative store buffer to the O3 model as well? I think that will be easier than trying to incorporate the current Ruby non-speculative store buffer into each protocol. Overall, I guess I'm concluding that we probably can delete the current Ruby store buffer. Do others agree? Brad From: Steve Reinhardt [mailto:ste...@gmail.com] Sent: Thursday, February 24, 2011 11:20 AM To: M5 Developer List Cc: Beckmann, Brad Subject: Re: [m5-dev] Store Buffer On Thu, Feb 24, 2011 at 11:08 AM, Beckmann, Brad brad.beckm...@amd.com wrote: So we probably don't want to pass speculative store data to the RubyPort, but what about speculative load and store requests? I suspect we do want to send them to the RubyPort before the speculation is confirmed. That might require splitting stores into two separate transactions: the request and the actual data write. Also, I suspect that the RubyPort will need to forward probes to the CPU models to allow the LSQ to maintain the proper consistency model. If those two things end up being true, then what is the benefit of putting the non-speculative store buffer in each protocol, versus just in the O3 CPU model? I'm not yet ready to advocate that is the right solution. I just want us to think these issues thru before deciding to go down one path or the other. I also support the concept of thinking things through, but I'm also happy to comment without having done that yet :-).
My gut instinct is to say that O3 already has an LSQ, so Ruby needs to send invalidations up to the core to support the consistency model, and if we do that there's no need for a store buffer in Ruby. I'd like to better understand the arguments against that approach. For example, why would we want to send stores to Ruby when they are still speculative? Do we have real examples of systems that send the store address to the L1 cache speculatively? If we want to fetch store data more aggressively, wouldn't it be equivalent to generate a prefetch-with-write-intent first, then generate the store itself only when it commits? I think there are machines that do that. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Store Buffer
So I think Steve and I are in agreement here. We both agree that both speculative and non-speculative store buffers should be on the CPU side of the RubyPort interface. I believe that was the same line that existed when Ruby was tied to Opal in GEMS. I believe the non-speculative store buffer was only a feature used when Opal was not attached, and it was just the simple SimicsProcessor driving Ruby. The sequencer is a separate issue. Certain functionality of the sequencer can probably be eliminated in gem5, but I think other functionality needs to remain or at least be moved to some other part of Ruby. The sequencer performs a lot of protocol-independent functionality, including: updating the actual data block, performing synchronization with respect to the cache memory, translating M5 packets to Ruby requests, checking for per-cache-block deadlock, and coalescing requests to the same cache block. The coalescing functionality can probably be eliminated, but I think the other functionality needs to remain. Brad From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Thursday, February 24, 2011 1:52 PM To: M5 Developer List Subject: Re: [m5-dev] Store Buffer On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish ni...@cs.wisc.edu wrote: On Thu, 24 Feb 2011, Beckmann, Brad wrote: Steve, I think we are in agreement here and we may just be disagreeing about the definition of speculative. From the Ruby perspective, I don't think it really matters...I don't think there is a difference between a speculative store address request and a prefetch-with-write-intent. Also, we agree that probes will need to be sent to the O3 LSQ to support the consistency model. My point is that if we believe this functionality is required, what is the extra overhead of adding a non-speculative store buffer to the O3 model as well? I think that will be easier than trying to incorporate the current Ruby non-speculative store buffer into each protocol.
I don't know the O3 LSQ model very well, but I assume it buffers both speculative and non-speculative stores. Are there two different structures in Ruby for that? I think the general issue here is that the dividing line between processor and memory system is different in M5 than it was with GEMS, with M5 assuming that write buffers, redundant request filtering, etc. all happen in the processor. For example, I know I've had you explain this to me multiple times already, but I still don't understand why we still need Ruby sequencers either :-). Brad, I raise the same point that Arka raised earlier. Other processor models can also make use of a store buffer. So, why should only O3 have a store buffer? Nilay, I think that's a different issue... we're not saying that other CPU models can't have store buffers, but in practice, the simple CPU models block on memory accesses, so they don't need one. If the inorder model wants to add a store buffer (if it doesn't already have one), it would be an internal decision for them whether they want to write one from scratch or try to reuse the O3 code. There are already some shared structures in src/cpu, like branch predictors, that can be reused across CPU models. So in other words, we need to decide first where the store buffer should live (CPU or memory system), and then we can worry about how to reuse that code if that's useful. Steve ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] MOESI_CMP_directory-perfectDir.sm
Since I haven't heard any objections, I'm going to go ahead and remove it. Brad From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Beckmann, Brad Sent: Tuesday, February 22, 2011 2:37 PM To: Default (m5-dev@m5sim.org) Subject: [m5-dev] MOESI_CMP_directory-perfectDir.sm Hi All, I just posted a patch that removes all of the protocol files that are not supported in gem5. However, I'm not sure if anyone has used/is using the file MOESI_CMP_directory-perfectDir.sm. I've never used it before and I have no idea if it even works or what exactly it is supposed to do. Do people mind if I just remove it? I'll post the same question to the user list. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Store Buffer
That's a good question. Before we get rid of it, we should decide what the interface between Ruby and the O3 LSQ is. I don't know how the current O3 LSQ works, but I imagine that we need to pass probe requests through the RubyPort to make it work correctly. Does anyone with knowledge of the O3 LSQ have a suggestion? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Wednesday, February 23, 2011 4:51 PM To: m5-dev@m5sim.org Subject: [m5-dev] Store Buffer Brad, In case we remove libruby, what becomes of the store buffer? In fact, is the store buffer in use? Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
[m5-dev] MOESI_CMP_directory-perfectDir.sm
Hi All, I just posted a patch that removes all of the protocol files that are not supported in gem5. However, I'm not sure if anyone has used/is using the file MOESI_CMP_directory-perfectDir.sm. I've never used it before and I have no idea if it even works or what exactly it is supposed to do. Do people mind if I just remove it? I'll post the same question to the user list. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] CacheController's wakeup function
Hi Nilay, I'm not quite sure what you mean by appended to while you drain, but I think you are asking whether the input ports will receive messages that are scheduled for the same cycle as the current cycle. Is that right? If so, then you are correct; that should not happen. As long as the input ports are evaluated in the current order of priority, your change looks good to me. In the past, one could limit the loop iterations per cycle to approximate cache port contention. Therefore the higher-priority ports must be listed first to avoid mandatory requests starving external responses. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, February 15, 2011 9:09 AM To: M5 Developer List Subject: Re: [m5-dev] CacheController's wakeup function On Tue, 15 Feb 2011, nathan binkert wrote: While I don't know anything about this code, it looks a little suspect to me. Is there really a while (true) or is there some sort of while (!empty)? Can the queues be appended to while you drain? If these are both true, then you'll lose some of your enqueued messages. Sorry if I'm uninformed. It is a while(true), and there is a break statement which is executed in case none of the queues have any messages. I am almost certain that the incoming queues do not get appended to while they are being drained; I would like Brad to confirm this. -- Nilay I thought of this a moment ago, so I have not confirmed this empirically. The CacheController's wakeup function includes a while loop, in which all the queues are checked. Consider the Hammer protocol's L1 cache controller. It has four incoming queues - trigger, response, forward, mandatory. The wakeup function looks like this -- while(true) { process trigger queue; process response queue; process forward queue; process mandatory queue; } where process means processing a single message from the queue.
I expect most of the messages to be present in the mandatory queue, which processes the actual loads and stores issued by the associated processor. Would the following be better -- while(true) process trigger queue; while(true) process response queue; while(true) process forward queue; while(true) process mandatory queue; I do not expect any improvement in the case of FS profiling, as most of the time the mandatory queue has only one single message. But for testing protocols using the ruby random tester, I do expect some improvement. In the FS profile, after the histogram function (which takes about 8% of the time), the wakeup function's execution time is the highest (about 5%). For the ruby random tester profile, the wakeup function takes about 11% of the time. -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
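The two loop shapes under discussion can be modeled like this (a toy sketch in Python, not the SLICC-generated C++). Note that only the interleaved form preserves the trigger > response > forward > mandatory priority within each pass, which is what Brad's caveat about capping iterations per cycle relies on:

```python
from collections import deque

def wakeup_interleaved(queues):
    """Current style: at most one message from each queue per pass,
    repeating until every queue is empty."""
    processed = []
    while True:
        progress = False
        for name, q in queues:
            if q:
                processed.append((name, q.popleft()))
                progress = True
        if not progress:
            break  # nothing left in any queue
    return processed

def wakeup_drained(queues):
    """Proposed style: drain each queue completely before moving on
    to the next (lower-priority) one."""
    processed = []
    for name, q in queues:
        while q:
            processed.append((name, q.popleft()))
    return processed
```

Both drain everything when run to completion; they differ only in the order messages are consumed when multiple queues are non-empty, which is exactly the case the random tester exercises.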
Re: [m5-dev] MOESI Hammer Protocol Deadlock
Hi Nilay, Thanks for the heads up. I looked into it and there is a simple fix. I'm pushing the fix momentarily. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, February 10, 2011 5:40 AM To: m5-dev@m5sim.org Subject: [m5-dev] MOESI Hammer Protocol Deadlock Hi Brad, I think the MOESI hammer protocol has a deadlock scenario. Try the following -
hg update -r 7922
scons USE_MYSQL=False RUBY=True CC=gcc44 CXX=g++44 NO_HTML=True --no-colors build/ALPHA_SE_MOESI_hammer/m5.fast
./build/ALPHA_SE_MOESI_hammer/m5.fast ./configs/example/ruby_random_test.py -n 4 -l 200
-- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Ruby FS Fails with recent Changesets
Hi Malek, Hmm...I have never seen that type of error before. As you mentioned, I don't think any of my recent patches changed how DMA is executed for ALPHA_FS. How long does it take for you to encounter the error? It would be great if you could tell me how I can reproduce the error. I would like to look at this in more detail and get a protocol trace of what is going on. Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Malek Musleh Sent: Thursday, February 10, 2011 5:05 AM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets Hi Brad, I tested your latest changeset, and it seems that it 'solves' the handleResponse error I was getting when running 3 or more cores, but the dma_expiry error is still there. Now the error is consistent, no matter what number of cores I try to run with: For more information see: http://www.m5sim.org/warn/3e0eccba panic: Inconsistent DMA transfer state: dmaState = 2 devState = 1 @ cycle 62411238889001 [doDmaTransfer:build/ALPHA_FS_MOESI_CMP_directory/dev/ide_disk.cc, line 323] Memory Usage: 382600 KBytes - M5 Terminal --- hda: max request size: 128KiB hda: 101808 sectors (52 MB), CHS=101/16/63 hda:4hda: dma_timer_expiry: dma status == 0x65 hda: DMA interrupt recovery hda: lost interrupt unknown partition table hdb: max request size: 128KiB hdb: 4177920 sectors (2139 MB), CHS=4144/16/63 hdb:4hdb: dma_timer_expiry: dma status == 0x65 hdb: DMA interrupt recovery hdb: lost interrupt The panic error seems to suggest an inconsistent DMA state, so I tried reverting to an older changeset (before the DMA changes were pushed out), such as 7936, and even 7930, but no such luck. The changeset that I know works from last week or so is changeset 7842. Looking at the changeset summaries between 7842 and 7930 seems to indicate a lot of changes 'unrelated' to the DMA, such as O3, InOrderCPU, and x86 changes.
That being said, I did not do a diff on those intermediate changesets to verify that maybe a related file was slightly modified in the process. I might be able to spend some more time trying changesets till I narrow down which one it's coming from, but maybe the new panic message might give you some indication of how to fix it? (I think the panic message appeared now and not before because I let the simulation terminate itself when running overnight, as opposed to me killing it once I saw the dma_expiry message on the M5 Terminal). Malek On Wed, Feb 9, 2011 at 7:00 PM, Beckmann, Brad brad.beckm...@amd.com wrote: Hi Malek, Yes, thanks for letting us know. I'm pretty sure I know what the problem is. Previously, if an SC operation failed, the RubyPort would convert the request packet to a response packet, bypass writing the functional view of memory, and pass it back up to the CPU. In my most recent patches I generalized the mechanism that converts request packets to response packets and avoids writing functional memory. However, I forgot to remove the duplicate request-to-response conversion for failed SC requests. Therefore, I bet you are encountering that assertion error on that duplicate call. It should be a simple one-line change that fixes your problem. I'll push it momentarily, and it would be great if you could confirm that my change does indeed fix your problem. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Wednesday, February 09, 2011 3:54 PM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets Thanks for letting us know. If it wouldn't be too much trouble, could you please try some other changesets near the one that isn't working and try to determine which one specifically broke things? A bunch of changes went in recently, so it would be helpful to narrow things down.
I'm not very involved with Ruby right now personally, but I assume that would be useful information for the people that are. Gabe On 02/09/11 14:51, Malek Musleh wrote: Hello, I first started using the Ruby Model in M5 about a week or so ago, and was able to boot in FS mode (up to 64 cores after applying the BigTsunami patches). In order to keep up with the changes in the Ruby code, I have started fetching recent updates from the dev repo. However, after fetching the updates to the recent changesets (from the last 2 days), Ruby FS does not boot. I tried both MESI_CMP_directory and MOESI_CMP_directory. If running 2 cores or less I get this at the terminal screen after letting it run for some time: hda: M5 IDE Disk, ATA DISK drive hdb: M5 IDE Disk, ATA DISK drive hda: UDMA/33 mode selected hdb: UDMA/33 mode selected ide0 at 0x8410-0x8417,0x8422 on irq 31 ide1 at 0x8418
Re: [m5-dev] Ruby FS Fails with recent Changesets
: listening for remote gdb #2 on port 7002 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003 REAL SIMULATION info: Entering event queue @ 0. Starting simulation... info: Launching CPU 1 @ 835461000 info: Launching CPU 2 @ 846156000 info: Launching CPU 3 @ 856768000 warn: Prefetch instrutions is Alpha do not do anything For more information see: http://www.m5sim.org/warn/3e0eccba 1349195500: system.terminal: attach terminal 0 warn: Prefetch instrutions is Alpha do not do anything For more information see: http://www.m5sim.org/warn/3e0eccba m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/ruby/system/RubyPort.cc:230: virtual bool RubyPort::M5Port::recvTiming(Packet*): Assertion `Address(ruby_request.paddr).getOffset() + ruby_request.len <= RubySystem::getBlockSizeBytes()' failed. Program aborted at cycle 2406378289516 Aborted The same error occurs for 7907 - 7908. At changeset 7909 is where the dma_expiry error first shows up: 7909: hda: M5 IDE Disk, ATA DISK drive hdb: M5 IDE Disk, ATA DISK drive hda: UDMA/33 mode selected hdb: UDMA/33 mode selected ide0 at 0x8410-0x8417,0x8422 on irq 31 ide1 at 0x8418-0x841f,0x8426 on irq 31 ide_generic: please use probe_mask=0x3f module parameter for probing all legacy ISA IDE ports ide2 at 0x1f0-0x1f7,0x3f6 on irq 14 ide3 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 101808 sectors (52 MB), CHS=101/16/63 hda:4hda: dma_timer_expiry: dma status == 0x65 hda: DMA interrupt recovery hda: lost interrupt unknown partition table hdb: max request size: 128KiB hdb: 4177920 sectors (2139 MB), CHS=4144/16/63 I tested changeset 7920: and that's where I notice the handleResponse() 7920: M5 compiled Feb 10 2011 14:49:49 M5 revision 39c86a8306d2+ 7920+ default M5 started Feb 10 2011 14:53:38 M5 executing on sherpa05 command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt ./configs/example/ruby_fs.py -n 4 --topology Crossbar Global frequency set at 1 ticks per second info: kernel located at: 
/home/musleh/M5/m5_system_2.0b3/binaries/vmlinux Listening for system connection on port 3456 0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003 REAL SIMULATION info: Entering event queue @ 0. Starting simulation... info: Launching CPU 1 @ 835461000 info: Launching CPU 2 @ 846156000 info: Launching CPU 3 @ 856768000 warn: Prefetch instrutions is Alpha do not do anything For more information see: http://www.m5sim.org/warn/3e0eccba 1128875500: system.terminal: attach terminal 0 warn: Prefetch instrutions is Alpha do not do anything For more information see: http://www.m5sim.org/warn/3e0eccba m5.opt: build/ALPHA_FS_MOESI_CMP_directory/mem/packet.hh:590: void Packet::makeResponse(): Assertion `needsResponse()' failed. Program aborted at cycle 36235566500 Aborted Note that I have not tested changesets 7911-7918. I have tested the MOESI_CMP_directory protocol on all of these with m5.opt. I have tested using MESI_CMP_directory for some of them and got the same messages. This is my command line: ./build/ALPHA_FS_MOESI_CMP_directory/m5.opt ./configs/example/ruby_fs.py -n 4 --topology Crossbar The error comes about 15 minutes into booting the kernel. Note that it takes a while for the io to be scheduled. 
io scheduler noop registered io scheduler anticipatory registered io scheduler deadline registered io scheduler cfq registered (default) In all cases though where the dma_expiry occurs (which does not include changesets 7906-7908), the last thing that appears is this: ide0 at 0x8410-0x8417,0x8422 on irq 31 ide1 at 0x8418-0x841f,0x8426 on irq 31 ide_generic: please use probe_mask=0x3f module parameter for probing all legacy ISA IDE ports ide2 at 0x1f0-0x1f7,0x3f6 on irq 14 ide3 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 101808 sectors (52 MB), CHS=101/16/63 hda:4hda: dma_timer_expiry: dma status == 0x65 hda: DMA interrupt recovery hda: lost interrupt unknown partition table hdb: max request size: 128KiB hdb: 4177920 sectors (2139 MB), CHS=4144/16/63 Is it possible to generate a trace for Ruby in M5 the way it is for Ruby in GEMS like something of this sort: http://www.cs.wisc.edu/gems/doc/gems-wiki/moin.cgi/How_do_I_understand_a_Protocol ? Let me know if you need any more information. Malek On Thu, Feb 10, 2011 at 4:43 PM, Beckmann, Brad brad.beckm...@amd.com wrote: Hi Malek, Hmm...I have never seen that type
Re: [m5-dev] Ruby FS Fails with recent Changesets
Hi Malek, Yes, thanks for letting us know. I'm pretty sure I know what the problem is. Previously, if an SC operation failed, the RubyPort would convert the request packet to a response packet, bypass writing the functional view of memory, and pass it back up to the CPU. In my most recent patches I generalized the mechanism that converts request packets to response packets and avoids writing functional memory. However, I forgot to remove the duplicate request-to-response conversion for failed SC requests. Therefore, I bet you are encountering that assertion error on that duplicate call. It should be a simple one-line change that fixes your problem. I'll push it momentarily and it would be great if you could confirm that my change does indeed fix your problem. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Wednesday, February 09, 2011 3:54 PM To: M5 Developer List Subject: Re: [m5-dev] Ruby FS Fails with recent Changesets Thanks for letting us know. If it wouldn't be too much trouble, could you please try some other changesets near the one that isn't working and try to determine which one specifically broke things? A bunch of changes went in recently so it would be helpful to narrow things down. I'm not very involved with Ruby right now personally, but I assume that would be useful information for the people that are. Gabe On 02/09/11 14:51, Malek Musleh wrote: Hello, I first started using the Ruby Model in M5 about a week or so ago, and was able to boot in FS mode (up to 64 cores once applying the BigTsunami patches). In order to keep up with the changes in the Ruby code, I have started fetching recent updates from the dev repo. However, in fetching the updates to the recent changesets (from the last 2 days) Ruby FS does not boot. I tried both MESI_CMP_directory and MOESI_CMP_directory. 
If running 2 cores or less I get this at the terminal screen after letting it run for some time: hda: M5 IDE Disk, ATA DISK drive hdb: M5 IDE Disk, ATA DISK drive hda: UDMA/33 mode selected hdb: UDMA/33 mode selected ide0 at 0x8410-0x8417,0x8422 on irq 31 ide1 at 0x8418-0x841f,0x8426 on irq 31 ide_generic: please use probe_mask=0x3f module parameter for probing all legacy ISA IDE ports ide2 at 0x1f0-0x1f7,0x3f6 on irq 14 ide3 at 0x170-0x177,0x376 on irq 15 hda: max request size: 128KiB hda: 101808 sectors (52 MB), CHS=101/16/63 hda:4hda: dma_timer_expiry: dma status == 0x65 --- problem When running 3 or more cores, I get the following assertion failure: info: kernel located at: /home/musleh/M5/m5_system_2.0b3/binaries/vmlinux Listening for system connection on port 3456 0: system.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000 0: system.remote_gdb.listener: listening for remote gdb #1 on port 7001 0: system.remote_gdb.listener: listening for remote gdb #2 on port 7002 0: system.remote_gdb.listener: listening for remote gdb #3 on port 7003 REAL SIMULATION info: Entering event queue @ 0. Starting simulation... info: Launching CPU 1 @ 834794000 info: Launching CPU 2 @ 845489000 info: Launching CPU 3 @ 856101000 m5.opt: build/ALPHA_FS_MESI_CMP_directory/mem/packet.hh:590: void Packet::makeResponse(): Assertion `needsResponse()' failed. Program aborted at cycle 97716 Aborted The top of the tree is this last changeset: changeset: 7939:215c8be67063 tag: tip user: Brad Beckmann brad.beckm...@amd.com date: Tue Feb 08 18:07:54 2011 -0800 summary: regess: protocol regression tester updates I am not sure if those whom it concerns are aware of it or not, or if there will be a soon to be updated changeset already in the works for this or not, but I figured I would bring it to your attention. 
Malek ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick
Hi Gabe, Since you successfully updated the tests I can't run (ARM_FS), I can take care of the remaining errors (i.e. ruby protocol tests). I have a few minor fixes I want to check in that I need to run the regression tester against anyway. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Tuesday, February 08, 2011 12:15 AM To: M5 Developer List Subject: Re: [m5-dev] Cron m5test@zizzer /z/m5/regression/do-regression quick Hmm. I didn't realize all the build targets for ruby protocols had their own separate regressions. I'll have to run those too. Gabe On 02/08/11 00:17, Cron Daemon wrote: * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MESI_CMP_directory FAILED! * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MESI_CMP_directory FAILED! * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_hammer FAILED! * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby-MESI_CMP_directory FAILED! * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_hammer FAILED! * build/ALPHA_SE_MOESI_hammer/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_hammer FAILED! * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_directory FAILED! * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_directory FAILED! * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/60.rubytest/alpha/linux/rubytest-ruby-MOESI_CMP_token FAILED! * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/tru64/simple-timing-ruby-MOESI_CMP_token FAILED! * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/00.hello/alpha/linux/simple-timing-ruby-MOESI_CMP_token FAILED! 
* build/ALPHA_SE_MOESI_hammer/tests/fast/quick/50.memtest/alpha/linux /memtest-ruby-MOESI_hammer FAILED! * build/ALPHA_SE_MOESI_CMP_token/tests/fast/quick/50.memtest/alpha/li nux/memtest-ruby-MOESI_CMP_token FAILED! * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/50.memtest/alpha /linux/memtest-ruby-MOESI_CMP_directory FAILED! scons: *** Source `tests/quick/01.hello-2T-smt/ref/alpha/linux/o3- timing/stats.txt' not found, needed by target `build/ALPHA_SE/tests/fast/quick/01.hello-2T-smt/alpha/linux/o3- timing/status'. * build/ALPHA_SE/tests/fast/quick/60.rubytest/alpha/linux/rubytest- ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple- atomic-mp passed. * build/ALPHA_SE/tests/fast/quick/30.eio-mp/alpha/eio/simple- timing-mp passed. * build/ALPHA_SE_MESI_CMP_directory/tests/fast/quick/60.rubytest/alpha/li nux/rubytest-ruby-MESI_CMP_directory passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- atomic passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/tru64/simple- timing-ruby passed. * build/ALPHA_SE/tests/fast/quick/20.eio-short/alpha/eio/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/simple- timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/o3-timing passed. * build/ALPHA_SE/tests/fast/quick/00.hello/alpha/linux/inorder- timing passed. * build/ALPHA_SE_MOESI_CMP_directory/tests/fast/quick/60.rubytest/alpha /linux/rubytest-ruby-MOESI_CMP_directory passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest-ruby passed. 
* build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-timing passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-timing-dual passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-atomic-dual passed. * build/ALPHA_FS/tests/fast/quick/80.netperf- stream/alpha/linux/twosys-tsunami-simple-atomic passed. * build/ALPHA_FS/tests/fast/quick/10.linux- boot/alpha/linux/tsunami-simple-atomic passed. * build/ALPHA_SE/tests/fast/quick/50.memtest/alpha/linux/memtest passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-atomic passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing- ruby passed. * build/MIPS_SE/tests/fast/quick/00.hello/mips/linux/simple-timing
Re: [m5-dev] Missing _ in ruby_fs.py
Ah, yes I did. This actually reminds me that I need to fix how dma devices are connected within Ruby for x86_FS. I'll push a patch that fixes these issues soon. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, February 08, 2011 9:54 AM To: m5-dev@m5sim.org Subject: [m5-dev] Missing _ in ruby_fs.py Hi Brad, did you miss out on the '_' in _dma_devices? -- Nilay

diff -r 6f5299ff8260 -r 00ad807ed2ca configs/example/ruby_fs.py
--- a/configs/example/ruby_fs.py    Sun Feb 06 22:14:18 2011 -0800
+++ b/configs/example/ruby_fs.py    Sun Feb 06 22:14:18 2011 -0800
@@ -109,12 +109,19 @@
 CPUClass.clock = options.clock

-system = makeLinuxAlphaRubySystem(test_mem_mode, bm[0])
-
-system.ruby = Ruby.create_system(options,
-                                 system,
-                                 system.piobus,
-                                 system._dma_devices)
+if buildEnv['TARGET_ISA'] == "alpha":
+    system = makeLinuxAlphaRubySystem(test_mem_mode, bm[0])
+    system.ruby = Ruby.create_system(options,
+                                     system,
+                                     system.piobus,
+                                     system.dma_devices)
+elif buildEnv['TARGET_ISA'] == "x86":
+    system = makeLinuxX86System(test_mem_mode, options.num_cpus, bm[0], True)
+    system.ruby = Ruby.create_system(options,
+                                     system,
+                                     system.piobus)
+else:
+    fatal("incapable of building non-alpha or non-x86 full system!")

 system.cpu = [CPUClass(cpu_id=i) for i in xrange(options.num_cpus)]

___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol
Hi Korey, Just to clarify, the deadlock threshold in the sequencer is different from the deadlock threshold in the mem tester. The sequencer's deadlock mechanism detects whether any particular request takes longer than the threshold. Meanwhile the mem tester deadlock threshold just ensures that a particular cpu sees at least one request complete within the deadlock threshold. I don't think we want to degrade the deadlock checker to just a warning. While in this particular case, the deadlock turned out to be just a performance issue, in my experience the vast majority of potential deadlock detections turn out to be real bugs. Later today I'll check in a patch that increases the ruby mem test deadlock threshold. Brad From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Korey Sewell Sent: Monday, February 07, 2011 2:27 PM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol Another followup on this is that the deadlock_threshold parameter doesn't propagate to the MemTester CPU. So when I'm testing 64 CPUs, the memtester.cc still has this code: if (!tickEvent.scheduled()) schedule(tickEvent, curTick() + ticks(1)); if (++noResponseCycles >= 50) { if (issueDmas) { cerr << "DMA tester "; } cerr << name() << ": deadlocked at cycle " << curTick() << endl; fatal(); } That hardcoded 50 is not a great number (as people have said) because as your topologies/memory hierarchies change, the max # of cycles that you have to wait for a response can also change, right? Increasing that # by hand is an arduous thing to do, so maybe that # should come from a parameter, as well as maybe we should warn there that a deadlock is possible after some type of inordinate wait time. 
The fix should be just to warn about a long wait after an inordinate period... Something like this, I think: if (++noResponseCycles % 50 == 0) { warn("cpu X has waited for %i cycles", noResponseCycles); } Lastly, should the memtester really send out a memory access on every tick? The actual injection rate could be much higher than the rate at which we resolve contention. Maybe we should consider having X many outstanding requests per CPU as a more realistic measure that can stress the system but not make the noResponseCycles stat (?) grow to such a high number. On Mon, Feb 7, 2011 at 1:27 PM, Beckmann, Brad brad.beckm...@amd.com wrote: Yep, if I increase the deadlock threshold to 5 million cycles, the deadlock warning is not encountered. However, I don't think that we should increase the default deadlock threshold by an order of magnitude. Instead, let's just increase the threshold for the mem tester. How about I check in the following small patch. Brad

diff --git a/configs/example/ruby_mem_test.py b/configs/example/ruby_mem_test.py
--- a/configs/example/ruby_mem_test.py
+++ b/configs/example/ruby_mem_test.py
@@ -135,6 +135,12 @@
     cpu.test = system.ruby.cpu_ruby_ports[i].port
     cpu.functional = system.funcmem.port

+    #
+    # Since the memtester is incredibly bursty, increase the deadlock
+    # threshold to 5 million cycles
+    #
+    system.ruby.cpu_ruby_ports[i].deadlock_threshold = 5000000
+
 for (i, dma) in enumerate(dmas):
     #
     # Tie the dma memtester ports to the correct functional port

diff --git a/tests/configs/memtest-ruby.py b/tests/configs/memtest-ruby.py
--- a/tests/configs/memtest-ruby.py
+++ b/tests/configs/memtest-ruby.py
@@ -96,6 +96,12 @@
     #
     cpus[i].test = ruby_port.port
     cpus[i].functional = system.funcmem.port
+
+    #
+    # Since the memtester is incredibly bursty, increase the deadlock
+    # threshold to 5 million cycles
+    #
+    ruby_port.deadlock_threshold = 5000000

 # ---
 # run simulation

-Original Message- From: 
m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, February 07, 2011 9:12 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol Brad, I also see the protocol getting into a deadlock. I tried to get a trace, but I get segmentation fault (yes, the segmentation fault only occurs when trace flag ProtocolTrace is supplied). It seems to me that memory is getting corrupted somewhere, because the fault occurs in malloc itself. It could be that the protocol is actually not in a deadlock. Both Arka and I had increased the deadlock threshold while testing the protocol. I will try with an increased threshold later in the day. One more thing, the Orion 2.0 code that was committed last night makes use of printf(). It did not compile cleanly for me. I had to change it to fatal() and include the header file base/misc.hh. -- Nilay On Mon, 7 Feb 2011, Beckmann, Brad wrote: FYI
Re: [m5-dev] changeset in m5: regress: Regression Tester output updates
Ugh...sorry about that. I had to update most of the stats because one of Joel's patches added several new stats. The problem was that I don't have the Linux kernel to run the ARM FS regression tests. Therefore those tests didn't run correctly and thus I incorrectly updated those regression output files. A similar problem occurred for the X86_SE o3 test. There is no excuse for my incorrect update of these regression output files. However, one thing that will help me in the future is making sure that all of us have the capability to run all regression tests. Many of us, including myself, don't have login access to zizzer at Michigan, and thus it is very hard for me to reproduce the environment on zizzer, including external file dependencies. Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Monday, February 07, 2011 12:47 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: regress: Regression Tester output updates I'm rolling back this stat update and rerunning/reupdating. It's going to take a while, but I'll push once it's done. Gabe On 02/06/11 23:03, Ali Saidi wrote: This seems like a really half-baked attempt to update the stats. You've removed all the ARM FS stats, and you only seem to have updated a few of the quick tests, however all of the other tests aren't updated. Since you added some stats every test will need an update. 
Ali On Feb 7, 2011, at 12:17 AM, Brad Beckmann wrote: changeset 05f52a716144 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=05f52a716144 description: regress: Regression Tester output updates diffstat: tests/quick/00.hello/ref/alpha/linux/inorder-timing/config.ini |13 +- tests/quick/00.hello/ref/alpha/linux/inorder-timing/simout | 8 +- tests/quick/00.hello/ref/alpha/linux/inorder-timing/stats.txt |10 +- tests/quick/00.hello/ref/alpha/linux/o3-timing/config.ini | 13 +- tests/quick/00.hello/ref/alpha/linux/o3-timing/simout | 8 +- tests/quick/00.hello/ref/alpha/linux/o3-timing/stats.txt | 31 +- tests/quick/00.hello/ref/alpha/linux/simple-atomic/config.ini |11 +- tests/quick/00.hello/ref/alpha/linux/simple-atomic/simout | 8 +- tests/quick/00.hello/ref/alpha/linux/simple-atomic/stats.txt |24 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MESI_CMP_directory/config.ini |14 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MESI_CMP_directory/ruby.stats |30 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MESI_CMP_directory/simout | 8 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MESI_CMP_directory/stats.txt |26 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_directory/config.ini |68 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_directory/ruby.stats |54 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_directory/simout | 8 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_directory/stats.txt |26 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_token/config.ini |68 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_token/ruby.stats |94 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_token/simout | 8 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_CMP_token/stats.txt |26 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- 
MOESI_hammer/config.ini|97 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_hammer/ruby.stats| 164 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_hammer/simout|10 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby- MOESI_hammer/stats.txt |30 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/config.ini | 228 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/ruby.stats | 282 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/simout | 8 +- tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby/stats.txt |26 +- tests/quick/00.hello/ref/alpha/linux/simple-timing/config.ini |13 +- tests/quick/00.hello/ref/alpha/linux/simple-timing/simout |12 +- tests/quick/00.hello/ref/alpha/linux/simple-timing/stats.txt |26 +- tests/quick/00.hello/ref/alpha/tru64/o3-timing/config.ini | 13 +- tests/quick/00.hello/ref/alpha/tru64/o3-timing/simout | 8 +- tests/quick/00.hello/ref/alpha/tru64/o3-timing/stats.txt | 30 +-
Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh
I agree Nilay. Do you want to push that patch, or would you like me to take care of it? Ideally Tushar should do it, but since he's in Singapore it is probably best that you or I do it. Thanks for pointing that out. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, February 07, 2011 9:23 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh Korey, I think the printf statements should be replaced with fatal() or panic() instead. -- Nilay On Mon, 7 Feb 2011, Korey Sewell wrote: changeset 5f2a2deb377d in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=5f2a2deb377d description: ruby: add stdio header in SRAM.hh missing header file caused RUBY_FS to not compile diffstat: src/mem/ruby/network/orion/Buffer/SRAM.hh | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diffs (11 lines): diff -r 2c2dc567a450 -r 5f2a2deb377d src/mem/ruby/network/orion/Buffer/SRAM.hh --- a/src/mem/ruby/network/orion/Buffer/SRAM.hh Mon Feb 07 01:23:16 2011 -0800 +++ b/src/mem/ruby/network/orion/Buffer/SRAM.hh Mon Feb 07 12:19:46 2011 -0500 @@ -39,6 +39,7 @@ #include "mem/ruby/network/orion/Type.hh" #include "mem/ruby/network/orion/OrionConfig.hh" #include "mem/ruby/network/orion/TechParameter.hh" +#include <stdio.h> class OutdrvUnit; class AmpUnit; ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh
Hi Nilay, I assume the printf's that give you problems are the five listed below. Based on my little understanding of orion, I believe you can reach those errors if you misconfigure the buffer. Therefore I do think that fatal is the correct call. Brad src/mem/ruby/network/orion/Buffer/BitlineUnit.cc:printf(error\n); src/mem/ruby/network/orion/Buffer/OutdrvUnit.cc:printf(error\n); src/mem/ruby/network/orion/Buffer/PrechargeUnit.cc:default: printf(error\n); return 0; src/mem/ruby/network/orion/Buffer/PrechargeUnit.cc:default: printf(error\n); return 0; src/mem/ruby/network/orion/Buffer/WordlineUnit.cc:printf(error\n); -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, February 07, 2011 9:35 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh I can do it. I have replaced all of the printf()s with fatal()s. Is this correct, or should I use panic() instead? -- Nilay On Mon, 7 Feb 2011, Beckmann, Brad wrote: I agree Nilay. Do you want to push that patch, or would you like me to take care of it? Ideally Tushar should do it, but since he's in Singapore it is probably best that you or I do it. Thanks for pointing that out. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev- boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, February 07, 2011 9:23 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: ruby: add stdio header in SRAM.hh Korey, I think the printf statements should be replaced with fatal() or panic() instead. 
-- Nilay On Mon, 7 Feb 2011, Korey Sewell wrote: changeset 5f2a2deb377d in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=5f2a2deb377d description: ruby: add stdio header in SRAM.hh missing header file caused RUBY_FS to not compile diffstat: src/mem/ruby/network/orion/Buffer/SRAM.hh | 1 + 1 files changed, 1 insertions(+), 0 deletions(-) diffs (11 lines): diff -r 2c2dc567a450 -r 5f2a2deb377d src/mem/ruby/network/orion/Buffer/SRAM.hh --- a/src/mem/ruby/network/orion/Buffer/SRAM.hh Mon Feb 07 01:23:16 2011 -0800 +++ b/src/mem/ruby/network/orion/Buffer/SRAM.hh Mon Feb 07 12:19:46 2011 -0500 @@ -39,6 +39,7 @@ #include "mem/ruby/network/orion/Type.hh" #include "mem/ruby/network/orion/OrionConfig.hh" #include "mem/ruby/network/orion/TechParameter.hh" +#include <stdio.h> class OutdrvUnit; class AmpUnit; ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol
Yep, if I increase the deadlock threshold to 5 million cycles, the deadlock warning is not encountered. However, I don't think that we should increase the default deadlock threshold by an order of magnitude. Instead, let's just increase the threshold for the mem tester. How about I check in the following small patch. Brad

diff --git a/configs/example/ruby_mem_test.py b/configs/example/ruby_mem_test.py
--- a/configs/example/ruby_mem_test.py
+++ b/configs/example/ruby_mem_test.py
@@ -135,6 +135,12 @@
     cpu.test = system.ruby.cpu_ruby_ports[i].port
     cpu.functional = system.funcmem.port

+    #
+    # Since the memtester is incredibly bursty, increase the deadlock
+    # threshold to 5 million cycles
+    #
+    system.ruby.cpu_ruby_ports[i].deadlock_threshold = 5000000
+
 for (i, dma) in enumerate(dmas):
     #
     # Tie the dma memtester ports to the correct functional port

diff --git a/tests/configs/memtest-ruby.py b/tests/configs/memtest-ruby.py
--- a/tests/configs/memtest-ruby.py
+++ b/tests/configs/memtest-ruby.py
@@ -96,6 +96,12 @@
     #
     cpus[i].test = ruby_port.port
     cpus[i].functional = system.funcmem.port
+
+    #
+    # Since the memtester is incredibly bursty, increase the deadlock
+    # threshold to 5 million cycles
+    #
+    ruby_port.deadlock_threshold = 5000000

 # ---
 # run simulation

-Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, February 07, 2011 9:12 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol Brad, I also see the protocol getting into a deadlock. I tried to get a trace, but I get segmentation fault (yes, the segmentation fault only occurs when trace flag ProtocolTrace is supplied). It seems to me that memory is getting corrupted somewhere, because the fault occurs in malloc itself. It could be that the protocol is actually not in a deadlock. Both Arka and I had increased the deadlock threshold while testing the protocol. 
I will try with an increased threshold later in the day. One more thing, the Orion 2.0 code that was committed last night makes use of printf(). It did not compile cleanly for me. I had to change it to fatal() and include the header file base/misc.hh. -- Nilay On Mon, 7 Feb 2011, Beckmann, Brad wrote: FYI...if my local regression tests are correct, this patch does not fix all the problems with the MESI_CMP_directory protocol. One of the patches I just checked in fixes a subtle bug in the ruby_mem_test. Fixing this bug exposes more deadlock problems in the MESI_CMP_directory protocol. To reproduce the regression tester's sequencer deadlock error, set the Randomization flag to false in the file configs/example/ruby_mem_test.py then run the following command: build/ALPHA_SE_MESI_CMP_directory/m5.debug configs/example/ruby_mem_test.py -n 8 Let me know if you have any questions, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, January 13, 2011 8:50 PM To: m5-dev@m5sim.org Subject: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol changeset 8f37a23e02d7 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=8f37a23e02d7 description: Ruby: Fixes MESI CMP directory protocol The current implementation of MESI CMP directory protocol is broken. This patch, from Arkaprava Basu, fixes the protocol. diffstat: ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] changeset in m5: regress: Regression Tester output updates
Hi Gabe, Yes, the set of patches I checked in require a lot of changes to the output files. I scanned parts of the regression tester patch last night and noticed those changes as well, including the 5x change (actually it is more like 10x in most cases) in simticks for the mem tester. They all make sense to me. There are multiple patches that impact the regression tester output. The McPAT cpu counter and work unit patches added several new variables to every stats.txt file. That is the major source of changes in those files you listed below (minus the memtester). The large difference in the memtester is something else. Another one of my patches fixed a problem with respect to what block size ruby indicated to the cpus. Fixing this problem exposed the fact that ruby did not support the retry semantics expected by the cpu models. I thus added that support, which then fixed a major problem in the memtester. Interestingly enough, indicating a block size of 0 to the memtester caused the memtester to issue only one request at a time per cpu. Now the memtester issues as many requests as possible to ruby until the sequencer's outstanding request count is reached (16 by default). The significantly higher contention is the reason why the memtester simticks increase by 5-10x. Overall, I am aware of the many changes I made to the regression tester output last night. The problem was that my changes were so significant that I failed to realize that I also slipped in completely removing the regression tester output files for the ARM_FS and x86 o3 timing tests. I also noticed last night that I couldn't successfully run the ARM_FS and x86 o3 timing tests locally, but I figured that was ok since those failures were due to environment issues. What I failed to do is put the two things together and realize that I can't update regression tester output if I can't successfully run those tests. Oh well, live and learn. 
Thanks Gabe for rerunning those tests and updating the regression tester output! Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Monday, February 07, 2011 1:59 PM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: regress: Regression Tester output updates Yeah, unfortunately some of those files we can't distribute, but I'm pretty sure the ARM Linux kernel we can. As we discussed before, it would be ideal to move away from the regressions that need files to run that we can't actually give people, but that's likely going to be a lot of work. In any case, the regressions reran and I have an update. I went through all the diffs and saw lots of what I expected (new stats, different host stats, minor config.ini changes, different paths to things) but I also saw a few regressions with unexpected (by me) differences. These were:

tests/quick/00.hello/ref/alpha/linux/simple-timing-ruby
tests/quick/00.hello/ref/alpha/tru64/simple-timing-ruby
tests/quick/00.hello/ref/mips/linux/simple-timing-ruby
tests/quick/00.hello/ref/sparc/linux/simple-timing-ruby
tests/quick/00.hello/ref/x86/linux/simple-timing-ruby
tests/quick/50.memtest/ref/alpha/linux/memtest-ruby
tests/quick/60.rubytest/ref/alpha/linux/rubytest-ruby

The memtest-ruby regression seems to be the most significantly affected, where the number of ticks is increased by a factor of about 5. The patch I made is attached in case anybody wants to go through it. I'd suggest that at least somebody who's familiar with Ruby go through the tests I pointed out and verify the changes are what they expected. Once one of the Ruby folks (Brad maybe?) lets me know everything is on track and nobody has asked otherwise, I'll go ahead and commit this. Gabe On 02/07/11 09:15, Beckmann, Brad wrote: Ugh...sorry about that. I had to update most of the stats because one of Joel's patches added several new stats. 
The problem was that I don't have the Linux kernel to run the ARM FS regression tests. Therefore those tests didn't run correctly and thus I incorrectly updated those regression output files. A similar problem occurred for the X86_SE o3 test. There is no excuse for my incorrect update of these regression output files. However, one thing that will help me in the future is making sure that all of us have the capability to run all regression tests. Many of us, including myself, don't have login access to zizzer at Michigan, and thus it is very hard for me to reproduce the environment on zizzer, including external file dependencies. Thanks, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev- boun...@m5sim.org] On Behalf Of Gabe Black Sent: Monday, February 07, 2011 12:47 AM To: M5 Developer List Subject: Re: [m5-dev] changeset in m5: regress: Regression Tester output updates I'm rolling back this stat update and rerunning/reupdating
Re: [m5-dev] changeset in m5: scons: show sources and targets when building, ...
Do people mind if I change the source and target color from Yellow to Green? I typically use a lighter background and the yellow text is very difficult to read. I figure green works well on both lighter and darker backgrounds, and it keeps the Green Bay Packer theme. :) Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt Sent: Friday, January 07, 2011 10:16 PM To: m5-dev@m5sim.org Subject: [m5-dev] changeset in m5: scons: show sources and targets when building, ... changeset b5003ac75977 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=b5003ac75977 description: scons: show sources and targets when building, and colorize output. I like the brevity of Ali's recent change, but the ambiguity of sometimes showing the source and sometimes the target is a little confusing. This patch makes scons typically list all sources and all targets for each action, with the common path prefix factored out for brevity. It's a little more verbose now but also more informative. Somehow Ali talked me into adding colors too, which is a whole 'nother story.

diffstat:

 SConstruct                     | 114 ++--
 src/SConscript                 |  32 +-
 src/arch/SConscript            |   2 +-
 src/arch/isa_parser.py         |   2 -
 src/python/SConscript          |   1 +
 src/python/m5/util/terminal.py | 113 ++
 6 files changed, 227 insertions(+), 37 deletions(-)

diffs (truncated from 442 to 300 lines):

diff -r 9f9e10967912 -r b5003ac75977 SConstruct
--- a/SConstruct	Tue Jan 04 21:40:49 2011 -0600
+++ b/SConstruct	Fri Jan 07 21:50:13 2011 -0800
@@ -1,5 +1,6 @@
 # -*- mode:python -*-
 
+# Copyright (c) 2011 Advanced Micro Devices, Inc.
 # Copyright (c) 2009 The Hewlett-Packard Development Company
 # Copyright (c) 2004-2005 The Regents of The University of Michigan
 # All rights reserved. 
@@ -120,6 +121,18 @@
 from m5.util import compareVersions, readCommand
 
+AddOption('--colors', dest='use_colors', action='store_true')
+AddOption('--no-colors', dest='use_colors', action='store_false')
+use_colors = GetOption('use_colors')
+
+if use_colors:
+    from m5.util.terminal import termcap
+elif use_colors is None:
+    # option unspecified; default behavior is to use colors iff isatty
+    from m5.util.terminal import tty_termcap as termcap
+else:
+    from m5.util.terminal import no_termcap as termcap
+
 ########################################################################
 #
 # Set up the main build environment.
 #
 ########################################################################
@@ -357,7 +370,7 @@
 # the ext directory should be on the #includes path
 main.Append(CPPPATH=[Dir('ext')])
 
-def _STRIP(path, env):
+def strip_build_path(path, env):
     path = str(path)
     variant_base = env['BUILDROOT'] + os.path.sep
     if path.startswith(variant_base):
@@ -366,29 +379,94 @@
         path = path[6:]
     return path
 
-def _STRIP_SOURCE(target, source, env, for_signature):
-    return _STRIP(source[0], env)
-main['STRIP_SOURCE'] = _STRIP_SOURCE
+# Generate a string of the form:
+#   common/path/prefix/src1, src2 -> tgt1, tgt2
+# to print while building. 
+class Transform(object):
+    # all specific color settings should be here and nowhere else
+    tool_color = termcap.Normal
+    pfx_color = termcap.Yellow
+    srcs_color = termcap.Yellow + termcap.Bold
+    arrow_color = termcap.Blue + termcap.Bold
+    tgts_color = termcap.Yellow + termcap.Bold
 
-def _STRIP_TARGET(target, source, env, for_signature):
-    return _STRIP(target[0], env)
-main['STRIP_TARGET'] = _STRIP_TARGET
+    def __init__(self, tool, max_sources=99):
+        self.format = self.tool_color + (" [%8s] " % tool) \
+                      + self.pfx_color + "%s" \
+                      + self.srcs_color + "%s" \
+                      + self.arrow_color + " -> " \
+                      + self.tgts_color + "%s" \
+                      + termcap.Normal
+        self.max_sources = max_sources
+
+    def __call__(self, target, source, env, for_signature=None):
+        # truncate source list according to max_sources param
+        source = source[0:self.max_sources]
+        def strip(f):
+            return strip_build_path(str(f), env)
+        if len(source) > 0:
+            srcs = map(strip, source)
+        else:
+            srcs = ['']
+        tgts = map(strip, target)
+        # surprisingly, os.path.commonprefix is a dumb char-by-char string
+        # operation that has nothing to do with paths.
+        com_pfx = os.path.commonprefix(srcs + tgts)
+        com_pfx_len = len(com_pfx)
+        if com_pfx:
+            # do some cleanup and sanity checking on common prefix
+            if com_pfx[-1] == ".":
+                # prefix matches all but file extension: ok
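The in-diff comment about os.path.commonprefix is worth a concrete demonstration, since it is what motivates the cleanup code that follows the call: the function compares strings character by character, so the "common prefix" it returns need not end on a path-component boundary.

```python
import os.path

# commonprefix works on characters, not path components, so it can
# return a prefix that cuts a directory or file name in half.
paths = ['src/arch/alpha/isa.cc', 'src/arch/arm/isa.cc']
pfx = os.path.commonprefix(paths)
print(pfx)  # 'src/arch/a' -- not an actual directory in either path
```

This is exactly why the scons patch has to sanity-check and trim the prefix before factoring it out of the printed source/target lists.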
Re: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol
FYI... if my local regression tests are correct, this patch does not fix all the problems with the MESI_CMP_directory protocol. One of the patches I just checked in fixes a subtle bug in the ruby_mem_test. Fixing this bug exposes more deadlock problems in the MESI_CMP_directory protocol. To reproduce the regression tester's sequencer deadlock error, set the Randomization flag to false in the file configs/example/ruby_mem_test.py and then run the following command:

build/ALPHA_SE_MESI_CMP_directory/m5.debug configs/example/ruby_mem_test.py -n 8

Let me know if you have any questions, Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, January 13, 2011 8:50 PM To: m5-dev@m5sim.org Subject: [m5-dev] changeset in m5: Ruby: Fixes MESI CMP directory protocol changeset 8f37a23e02d7 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=8f37a23e02d7 description: Ruby: Fixes MESI CMP directory protocol The current implementation of MESI CMP directory protocol is broken. This patch, from Arkaprava Basu, fixes the protocol. 
diffstat:

 src/mem/protocol/MESI_CMP_directory-L1cache.sm | 25 +++--
 src/mem/protocol/MESI_CMP_directory-L2cache.sm | 25 -
 2 files changed, 35 insertions(+), 15 deletions(-)

diffs (123 lines):

diff -r 7107a2f3e53a -r 8f37a23e02d7 src/mem/protocol/MESI_CMP_directory-L1cache.sm
--- a/src/mem/protocol/MESI_CMP_directory-L1cache.sm	Thu Jan 13 12:30:18 2011 -0800
+++ b/src/mem/protocol/MESI_CMP_directory-L1cache.sm	Thu Jan 13 22:17:11 2011 -0600
@@ -70,6 +70,7 @@
     M_I, desc="L1 replacing, waiting for ACK";
     E_I, desc="L1 replacing, waiting for ACK";
+    SINK_WB_ACK, desc="This is to sink WB_Acks from L2";
   }
@@ -749,9 +750,8 @@
     l_popRequestQueue;
   }
 
-  transition(M_I, Inv, I) {
+  transition(M_I, Inv, SINK_WB_ACK) {
     ft_sendDataToL2_fromTBE;
-    s_deallocateTBE;
     l_popRequestQueue;
   }
@@ -766,16 +766,14 @@
     l_popRequestQueue;
   }
 
-  transition(M_I, Fwd_GETX, I) {
+  transition(M_I, Fwd_GETX, SINK_WB_ACK) {
     dt_sendDataToRequestor_fromTBE;
-    s_deallocateTBE;
     l_popRequestQueue;
   }
 
-  transition(M_I, {Fwd_GETS, Fwd_GET_INSTR}, I) {
+  transition(M_I, {Fwd_GETS, Fwd_GET_INSTR}, SINK_WB_ACK) {
     dt_sendDataToRequestor_fromTBE;
     d2t_sendDataToL2_fromTBE;
-    s_deallocateTBE;
     l_popRequestQueue;
   }
@@ -865,6 +863,21 @@
     s_deallocateTBE;
     o_popIncomingResponseQueue;
   }
+
+  transition(SINK_WB_ACK, {Load, Store, Ifetch, L1_Replacement}) {
+    z_recycleMandatoryQueue;
+  }
+
+  transition(SINK_WB_ACK, Inv) {
+    fi_sendInvAck;
+    l_popRequestQueue;
+  }
+
+  transition(SINK_WB_ACK, WB_Ack) {
+    s_deallocateTBE;
+    o_popIncomingResponseQueue;
+  }
 }
diff -r 7107a2f3e53a -r 8f37a23e02d7 src/mem/protocol/MESI_CMP_directory-L2cache.sm
--- a/src/mem/protocol/MESI_CMP_directory-L2cache.sm	Thu Jan 13 12:30:18 2011 -0800
+++ b/src/mem/protocol/MESI_CMP_directory-L2cache.sm	Thu Jan 13 22:17:11 2011 -0600
@@ -734,11 +734,13 @@
   // BASE STATE - I
   // Transitions from I (Idle)
-  transition({NP, IS, ISS, IM, SS, M, M_I, MT_I, MCT_I, I_I, S_I, SS_MB, M_MB, MT_IIB, MT_IB, MT_SB}, L1_PUTX) {
+  transition({NP, IS, ISS, IM, SS, M, M_I, I_I, S_I, M_MB, MT_IB, 
MT_SB}, L1_PUTX) {
+    t_sendWBAck;
     jj_popL1RequestQueue;
   }
 
-  transition({NP, SS, M, MT, M_I, MT_I, MCT_I, I_I, S_I, IS, ISS, IM, SS_MB, M_MB, MT_IIB, MT_IB, MT_SB}, L1_PUTX_old) {
+  transition({NP, SS, M, MT, M_I, I_I, S_I, IS, ISS, IM, M_MB, MT_IB, MT_SB}, L1_PUTX_old) {
+    t_sendWBAck;
     jj_popL1RequestQueue;
   }
@@ -968,6 +970,10 @@
     mmu_markExclusiveFromUnblock;
     k_popUnblockQueue;
   }
+
+  transition(MT_IIB, {L1_PUTX, L1_PUTX_old}) {
+    zz_recycleL1RequestQueue;
+  }
 
   transition(MT_IIB, Unblock, MT_IB) {
     nnu_addSharerFromUnblock;
@@ -1015,21 +1021,22 @@
     o_popIncomingResponseQueue;
   }
 
+  transition(MCT_I, {L1_PUTX, L1_PUTX_old}) {
+    zz_recycleL1RequestQueue;
+  }
+
   // L1 never changed Dirty data
   transition(MT_I, Ack_all, M_I) {
     ct_exclusiveReplacementFromTBE;
     o_popIncomingResponseQueue;
   }
 
-  // drop this because L1 will send data again
-  // the reason we don't accept is that the request virtual network may be completely backed up
-  // transition(MT_I, L1_PUTX) {
-  //   jj_popL1RequestQueue;
-  // }
+  transition(MT_I, {L1_PUTX, L1_PUTX_old}) {
+    zz_recycleL1RequestQueue;
+  }
 
   // possible race between unblock and immediate replacement
-  transition(MT_MB, {L1_PUTX, L1_PUTX_old}) {
+  transition({MT_MB, SS_MB}, {L1_PUTX, L1_PUTX_old}) {
     zz_recycleL1RequestQueue;
   }
___ m5-dev mailing list m5-dev@m5sim.org
Re: [m5-dev] PerfectSwitch
Hi Nilay, Yes, you could make such an optimization, but you want to be careful not to introduce starvation. You want to make sure that newly arriving messages are not always prioritized over previously stalled messages. Could you avoid looping through all message buffers by creating a list of ready messages and simply scanning that instead? You still want to store the messages in the message buffers because they model the virtual channel storage. However, the list can be what the wakeup function actually scans. Does that make sense to you, or am I overlooking something? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, February 03, 2011 10:23 AM To: M5 Developer List Subject: Re: [m5-dev] PerfectSwitch On Thu, 3 Feb 2011, Nilay Vaish wrote: I implemented this approach. But it did not improve the performance. So I tried to explore what could be the cause. The function PerfectSwitch::wakeup() contains three loops:

  loop on number of virtual networks
    loop on number of incoming links
      loop till all messages for this (link, network) have been routed

I am working with an 8-processor mesh network and run ruby_random_test.py for 400,000 loads. About 11-12% of the time is taken by this function, which is the highest amongst all functions. I moved the third loop to another function. I found that the wakeup function itself is called about 76,000,000 times, and the number of messages processed is about 81,000,000. Of these, about 71,000,000 have destination count = 1. Surprisingly, the inner loop that I had separated out as a function was called 3,600,000,000 times. That is about 45 times per invocation of the wakeup function, when each invocation of the wakeup function processes just about one message. When is the wakeup function called? Is it called in a periodic fashion? Or when a message needs to be routed? 
Is it possible that instead of looking at all the virtual networks and links, we look at only those that have messages that need routing? I found that wakeup is scheduled only when a message needs to be routed. This is done using the consumer pointer. So, we need to somehow inform the switch, whenever a wakeup event happens, which links and networks need to be looked at. But this would mean a change in the Consumer class and in the RubyEvent class. Should we add a new parameter to the scheduling function, which would be some information that the wakeup function receives? -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
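Brad's suggestion above — keep the messages in the buffers (since those model the virtual channel storage) but have wakeup() scan only a small list of (vnet, link) pairs known to be ready — can be sketched as follows. This is an illustrative Python model, not the actual PerfectSwitch C++ code; all names are hypothetical. The FIFO ready list also addresses the starvation concern: pairs are serviced in the order they first became ready, so new arrivals cannot jump ahead of previously stalled ones.

```python
from collections import deque

class ToySwitch:
    """Scan only (vnet, link) pairs with pending messages, instead of
    looping over every vnet x link combination on each wakeup."""
    def __init__(self):
        self.buffers = {}     # (vnet, link) -> deque of messages (VC storage)
        self.ready = deque()  # FIFO of (vnet, link) pairs to service
        self.enqueued = set() # pairs already on the ready list

    def message_arrived(self, vnet, link, msg):
        self.buffers.setdefault((vnet, link), deque()).append(msg)
        # Note each pair at most once; FIFO order avoids starving
        # messages that arrived (and stalled) earlier.
        if (vnet, link) not in self.enqueued:
            self.enqueued.add((vnet, link))
            self.ready.append((vnet, link))

    def wakeup(self):
        routed = []
        while self.ready:
            pair = self.ready.popleft()
            self.enqueued.discard(pair)
            buf = self.buffers[pair]
            while buf:
                routed.append(buf.popleft())
        return routed

sw = ToySwitch()
sw.message_arrived(0, 1, "m1")
sw.message_arrived(0, 1, "m2")
sw.message_arrived(2, 3, "m3")
out = sw.wakeup()  # visits 2 ready pairs, not every vnet x link
```

The open question in the thread — how the switch learns *which* pairs are ready — corresponds here to message_arrived(), which in gem5 would have to be driven from the MessageBuffer/Consumer side.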
Re: [m5-dev] Review Request: ruby: support to stallAndWait the mandatory queue
Thanks Arka for that response. You summed it up well. There are just a couple of additional things I want to point out: 1. One thing that makes this mechanism work is that one must rank each input port. In other words, the programmer must understand and communicate the dependencies between message classes/protocol virtual channels. That way the correct messages are woken up when the appropriate event occurs. 2. In Nilay's example, you want to make sure that you don't delay the issuing of request A until the replacement of block B completes. Instead, request A should allocate a TBE and issue in parallel with replacing B. The mandatory queue is popped only when the cache message is consumed. When the cache message is stalled, it is basically moved to a temporary data structure within the message buffer, where it waits until a higher priority message for the same cache block wakes it up. Brad From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Arkaprava Basu Sent: Saturday, January 22, 2011 10:49 AM To: M5 Developer List Cc: Gabe Black; Ali Saidi Subject: Re: [m5-dev] Review Request: ruby: support to stallAndWait the mandatory queue Hi Nilay, You are mostly correct. I believe this patch contains two things: 1. Support in SLICC to allow waiting and stalling on messages in a message buffer when the directory is in a blocking state for that address (i.e. it cannot process the message at this point), until some event occurs that makes consumption of the message possible. When the directory unblocks, it provides the support for waking up the messages that were hitherto waiting (this is the precise reason why you did not see a pop of the mandatory queue, but see WakeUpAllDependants). 2. It contains changes to the MOESI_hammer protocol that leverage this support. For the purpose of this particular discussion, the 1st part is the relevant one. 
As far as I understand, the support in SLICC for waiting and stalling was introduced primarily to enhance fairness in the way SLICC handles coherence requests. Without this support, when a message arrives at a controller in a blocking state, it recycles, which means it is polled again (and thus looked up again) in 10 cycles (generally the recycle latency is set to 10). If multiple messages arrive while the controller is in a blocking state for a given address, you can easily see that there is NO fairness. A message that arrived latest for the blocking address can be served first when the controller unblocks. With the new support for stalling and waiting, the blocked messages are put in a FIFO queue, thus providing better fairness. But as you have correctly guessed, another major advantage of this support is that it reduces unnecessary lookups to the cache structure that happen due to polling (a.k.a. recycle). So in summary, I believe that the problem you are seeing with too many lookups will *reduce* when the protocols are adjusted to take advantage of this facility. On a related note, I should also mention that another fringe benefit of this support is that it helps in debugging coherence protocols. With this, coherence protocol traces won't contain thousands of debug messages for recycling, which can be pretty annoying for the protocol writers. I hope this helps, Thanks Arka On 01/22/2011 06:40 AM, Nilay Vaish wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/408/#review797 --- I was thinking about the ratio of the number of memory lookups, as reported by gprof, to the number of memory references, as reported in stats.txt. While I was working with the MESI CMP directory protocol, I had seen that the same request from the processor is looked up again and again in the cache, if the request is waiting for some event to happen. 
For example, suppose a processor asks for loading address A, but the cache has no space for holding address A. Then, it will give up some cache block B before it can bring in address A. The problem is that while cache block B is being given up, it is possible that the request made for address A is looked up in the cache again, even though we know it is not possible that we would find it in the cache. This is because the requests in the mandatory queue are recycled till they are done with. Clearly, we should move the request for bringing in address A to a separate structure, instead of looking it up again and again. The new structure should be looked up whenever an event that could possibly affect the status of this request occurs. If we do this, then I think we should see a further reduction in the number of lookups. I would expect almost 90% of the lookups to the cache to go away. This should also mean a 5% improvement in simulator performance. Brad, do you agree
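The stall-and-wait mechanism discussed in this thread can be modeled with a per-address wait list: instead of recycling (re-polling, and re-looking-up) a blocked message every 10 cycles, the controller parks it, and a later event on the same block moves the whole FIFO back onto the input queue. The sketch below is a hypothetical Python model; the real support is SLICC-generated C++ in the MessageBuffer machinery, and all names here are illustrative.

```python
from collections import deque

class ToyController:
    """Illustrative model of stall_and_wait / wakeUpAllDependents."""
    def __init__(self):
        self.input_queue = deque()
        self.stalled = {}   # address -> FIFO of messages waiting on it
        self.lookups = 0    # counts cache-tag lookups performed

    def process_one(self, blocked_addrs):
        msg = self.input_queue.popleft()
        self.lookups += 1
        if msg["addr"] in blocked_addrs:
            # stall_and_wait: park the message instead of recycling it,
            # so it costs no further lookups until something changes
            self.stalled.setdefault(msg["addr"], deque()).append(msg)
            return None
        return msg

    def wake_up_all_dependents(self, addr):
        # on unblock, re-inject the waiters in arrival (FIFO) order
        self.input_queue.extend(self.stalled.pop(addr, deque()))

c = ToyController()
c.input_queue.append({"addr": "A", "id": 1})
c.input_queue.append({"addr": "B", "id": 2})
c.process_one(blocked_addrs={"A"})           # id 1 parks; no re-polling
served = c.process_one(blocked_addrs={"A"})  # id 2 proceeds past it
c.wake_up_all_dependents("A")                # block A becomes available
woken = c.process_one(blocked_addrs=set())   # id 1 finally serviced
```

Note the lookup count: three lookups total for three message-processing attempts, where a recycle-every-10-cycles scheme would have re-looked-up the parked message repeatedly while A was blocked. The FIFO per-address queue is also what gives the fairness Arka describes.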
Re: [m5-dev] Error in Simulating Mesh Network
Yes, but right now my repo is a couple weeks behind the main repo and I'd rather get all these patches resolved first, then sync up with the main repo and do my final regression testing once. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Saturday, January 22, 2011 2:26 AM To: M5 Developer List Subject: Re: [m5-dev] Error in Simulating Mesh Network You should be able to move that around any other patches ahead of it, right? It's so simple I wouldn't expect it to really depend on the intervening patches. Gabe Beckmann, Brad wrote: Hi Nilay, Yes, I am aware of this problem and one of the patches (http://reviews.m5sim.org/r/381/) I'm planning to check in does fix this. Unfortunately, those patches are being hung up because I need to do some more work on another one of them and right now I don't have any time to do so. As you can see from the patch, it is a very simple fix, so you may want to do it locally if it is blocking you. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev- boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, January 20, 2011 6:16 AM To: m5-dev@m5sim.org Subject: [m5-dev] Error in Simulating Mesh Network Brad, I tried simulating a mesh network with four processors. ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py --maxtick 2000 -n 4 --topology Mesh --mesh-rows 2 --num-l2cache 4 --num-dir 4 I receive the following error: panic: FIFO ordering violated: [MessageBuffer: consumer-yes [ [71227521, 870, 1; ] ]] [Version 1, L1Cache, triggerQueue_in] name: [Version 1, L1Cache, triggerQueue_in] current time: 71227512 delta: 1 arrival_time: 71227513 last arrival_time: 71227521 @ cycle 35613756000 [enqueue:build/ALPHA_FS_MOESI_hammer/mem/ruby/buffers/MessageBuffer.cc, line 198] Do you think that the options I have specified should work correctly? 
Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Error in Simulating Mesh Network
Hi Nilay, Yes, I am aware of this problem and one of the patches (http://reviews.m5sim.org/r/381/) I'm planning to check in does fix this. Unfortunately, those patches are being hung up because I need to do some more work on another one of them and right now I don't have any time to do so. As you can see from the patch, it is a very simple fix, so you may want to do it locally if it is blocking you. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, January 20, 2011 6:16 AM To: m5-dev@m5sim.org Subject: [m5-dev] Error in Simulating Mesh Network Brad, I tried simulating a mesh network with four processors. ./build/ALPHA_FS_MOESI_hammer/m5.prof ./configs/example/ruby_fs.py --maxtick 2000 -n 4 --topology Mesh --mesh-rows 2 --num-l2cache 4 --num-dir 4 I receive the following error: panic: FIFO ordering violated: [MessageBuffer: consumer-yes [ [71227521, 870, 1; ] ]] [Version 1, L1Cache, triggerQueue_in] name: [Version 1, L1Cache, triggerQueue_in] current time: 71227512 delta: 1 arrival_time: 71227513 last arrival_time: 71227521 @ cycle 35613756000 [enqueue:build/ALPHA_FS_MOESI_hammer/mem/ruby/buffers/MessageBuffer.cc, line 198] Do you think that the options I have specified should work correctly? Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] (no subject)
Hi Nilay, Yes, that is correct. There is a comment at the top of the file src/mem/ruby/network/topologies/Mesh.py which says that very thing: # Makes a generic mesh assuming an equal number of cache and directory cntrls The function only allows the number of DMA controllers to not be a multiple of the number of routers. If you know of a better way to handle this, please let me know. Brad -Original Message- From: Nilay [mailto:ni...@cs.wisc.edu] Sent: Tuesday, January 18, 2011 9:28 PM To: Beckmann, Brad Cc: m5-dev@m5sim.org Subject: RE: Brad, I got the simulation working. It seems to me that you wrote Mesh.py under the assumption that the number of cpus = number of L1 controllers = number of L2 controllers (if present) = number of directory controllers. The following options worked after some struggle and some help from Arka:

./build/ALPHA_FS_MESI_CMP_directory/m5.fast ./configs/example/ruby_fs.py --maxtick 20 -n 16 --topology Mesh --mesh-rows 4 --num-dirs 16 --num-l2caches 16

-- Nilay On Tue, January 18, 2011 10:28 am, Beckmann, Brad wrote: Hi Nilay, My plan is to tackle the functional access support as soon as I check in our current group of outstanding patches. I'm hoping to at least check in the majority of them in the next couple of days. Now that you've completed the CacheMemory access changes, you may want to re-profile GEM5 and make sure the next performance bottleneck is routing network messages in the Perfect Switch. In particular, you'll want to look at rather large (16+ core) systems using a standard Mesh network. If you have any questions on how to do that, Arka may be able to help you out; if not, I can certainly help you. Assuming the Perfect Switch shows up as a major bottleneck (> 10%), then I would suggest that as the next area you can work on. When looking at possible solutions, don't limit yourself to just changes within Perfect Switch itself. 
I suspect that redesigning how destinations are encoded and/or the interface between MessageBuffer dequeues and the PerfectSwitch wakeup will lead to a better solution. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Tuesday, January 18, 2011 6:59 AM To: Beckmann, Brad Cc: m5-dev@m5sim.org Subject: Hi Brad Now that those changes to CacheMemory, SLICC and protocol files have been pushed in, what's next that you think we should work on? I was going through some of the earlier emails. You have mentioned functional access support in Ruby, design of the Perfect Switch, consolidation of stat files. Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
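The Mesh.py assumption discussed above — equal numbers of cache and directory controllers per router, with only the DMA controllers allowed to be a non-multiple — amounts to a divisibility check before controllers are laid onto the mesh. Here is a hypothetical sketch of that check (not the actual Mesh.py code; the function name and error handling are illustrative):

```python
def check_mesh_counts(num_routers, num_l1, num_l2, num_dirs):
    """Mesh.py-style sanity check: every non-DMA controller class must
    spread evenly over the routers, one slice per router."""
    for name, count in [("L1", num_l1), ("L2", num_l2), ("dir", num_dirs)]:
        if count % num_routers != 0:
            raise ValueError("%s controller count %d is not a multiple of "
                             "%d routers" % (name, count, num_routers))
    return True

check_mesh_counts(16, 16, 16, 16)  # the kind of 4x4 config that works
```

This is why Nilay's command line only succeeded once --num-dirs and --num-l2caches matched the 16 cpus: a mismatched count leaves some router without its expected controller slice.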
Re: [m5-dev] (no subject)
Hi Nilay, My plan is to tackle the functional access support as soon as I check in our current group of outstanding patches. I'm hoping to at least check in the majority of them in the next couple of days. Now that you've completed the CacheMemory access changes, you may want to re-profile GEM5 and make sure the next performance bottleneck is routing network messages in the Perfect Switch. In particular, you'll want to look at rather large (16+ core) systems using a standard Mesh network. If you have any questions on how to do that, Arka may be able to help you out; if not, I can certainly help you. Assuming the Perfect Switch shows up as a major bottleneck (> 10%), then I would suggest that as the next area you can work on. When looking at possible solutions, don't limit yourself to just changes within Perfect Switch itself. I suspect that redesigning how destinations are encoded and/or the interface between MessageBuffer dequeues and the PerfectSwitch wakeup will lead to a better solution. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Tuesday, January 18, 2011 6:59 AM To: Beckmann, Brad Cc: m5-dev@m5sim.org Subject: Hi Brad Now that those changes to CacheMemory, SLICC and protocol files have been pushed in, what's next that you think we should work on? I was going through some of the earlier emails. You have mentioned functional access support in Ruby, design of the Perfect Switch, consolidation of stat files. Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Question on SLICC
Nilay, Are you trying to replace CacheMsg with RubyRequest? I agree that we can probably get rid of one of them. If I recall, right now RubyRequest is defined in libruby.hh. Is the Ruby library interface still important to you all at Wisconsin? If not, I would like to get rid of the libruby files. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, January 18, 2011 10:45 AM To: M5 Developer List Subject: Re: [m5-dev] Question on SLICC Figured that out last night. I also noticed that there is a comment about it in RubySlicc_Types.sm (should read files more carefully). Actually, I am trying to get rid of the CacheMsg class. Currently, RubyRequest is created from a packet (which I believe is an m5 primitive) and then a CacheMsg is created from the RubyRequest. Thanks Nilay On Tue, 18 Jan 2011, nathan binkert wrote: There are certain types defined in the file src/mem/protocol/RubySlicc_Types.sm. For each such type, a .hh file gets written which contains the path of the actual header file to be used. For example, the file RubySlicc_Types.sm defines the CacheMemory type. This type is actually defined in the file src/mem/ruby/system/CacheMemory.hh. When a protocol is compiled, the file build/protocol_name/mem/protocol/CacheMemory.hh gets written. This file contains just one line: an #include of the path to CacheMemory.hh. My question is which script writes this file. I have looked around but have not been able to figure it out yet. That gets done in src/mem/ruby/SConscript. The reason it gets done there is because the .hh file is actually in the system directory, but the way the slicc code is generated, it tries to include it from the protocol directory. In the original slicc/ruby, this didn't matter because all directories were in the include search path, but in M5 we need to know the path. There was no easy way to fix this, so this ugly band-aid exists. It'd be awesome to get rid of it. 
Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] EIO Regression Tests
Hi Nilay, I understand your confusion. This is an example of where the wiki needs to be updated. I believe the wiki only mentions the encumbered tarball and doesn't mention the encumbered hg repo on repo.m5sim.org. As far as the anagram test program goes, I remember Lisa and I encountered the same issue a while back, and to resolve it I believe Lisa copied that test along with several other regression tester programs from Michigan to AMD. I can provide you those regression tester programs, but at a higher level, I think this is a good time to ask the question of how we want to provide external users with all the files necessary to run the regression tester. As Nilay points out, the encumbered repo has some, but not all, of the necessary files. I believe one also needs another set of regression tester programs, which includes both the anagram files as well as the SPECCPU files for the long regression tester runs. Thoughts? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, January 17, 2011 1:55 PM To: M5 Developer List Subject: Re: [m5-dev] EIO Regression Tests I figured that out, but there is no anagram directory in tests/test-progs. I, therefore, receive the following error: gzip: tests/test-progs/anagram/bin/alpha/eio/anagram-vshort.eio.gz: No such file or directory -- Nilay On Mon, 17 Jan 2011, Steve Reinhardt wrote: The one where the EIO code lives. That's its name, at http://repo.m5sim.org. On Mon, Jan 17, 2011 at 12:59 PM, Nilay Vaish ni...@cs.wisc.edu wrote: What do you mean by the encumbered repository? On Mon, 17 Jan 2011, Steve Reinhardt wrote: Yes, it should be a concern... it should work. Did you do a pull on the encumbered repository? There were some changes there needed to maintain compatibility with the latest m5 dev repo. Otherwise you'll need to provide more detail about how things failed. 
Steve On Mon, Jan 17, 2011 at 10:21 AM, Nilay Vaish ni...@cs.wisc.edu wrote: I just ran the regression tests for the patch (deals with SLICC and cache coherence protocols) that I need to commit. The EIO tests fail. Should this be a concern? -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] EIO Regression Tests
Thanks Gabe. I had completely forgotten about the fact that we can freely distribute some of those tests. Your suggestion to create a second, shorter regression tester that focuses on testing different mechanisms sounds like a great idea. Hopefully we can get that done sometime. In the meantime, let's just make a note to update the wiki in the near future on the current procedure for running the regression tester, pointing people to the binaries that we can't distribute ourselves. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabriel Michael Black Sent: Monday, January 17, 2011 4:23 PM To: m5-dev@m5sim.org Subject: Re: [m5-dev] EIO Regression Tests I think there are two important aspects of this issue. 1. Using regression tests we can't distribute freely has some important limitations. It would be nice to replace them with ones we can. 2. The majority of the regression tests we have now are really benchmarks which provide basic coverage by working/not working and not changing behavior unexpectedly. That's an important element to have since it's a practical reality check and probably hits things we wouldn't think to test. They have significant limitations, though, since they take a long time to run and tend to exercise the same simulator functionality over and over. For instance, gcc may generate code that always has the same type of backward branch for a for loop. Using gzip as a test will verify that that branch works, but possibly not the slightly different variant that may, for instance, use a large branch displacement. Even when writing code in x86 assembly it can be impossible to predict which of the possibly many redundant instruction encodings the assembler might pick.
So, in everyone's infinite free time, I think we should replace our benchmark based regressions with a smaller set of freely distributable regressions/inputs, and augment them with shorter, targeted tests that exercise particular mechanisms, circumstances, instructions, etc. Instead of replacing our existing benchmarks which are useful as actual benchmarks and are good to keep working, we could build up this second set of tests in parallel. Gabe Quoting Beckmann, Brad brad.beckm...@amd.com: Hi Nilay, I understand your confusion. This is an example of where the wiki needs to be updated. I believe the wiki only mentions the encumbered tar ball and doesn't mention the encumbered hg repo on repo.m5sim.org. As far as the anagram test program goes, I remember Lisa and I encountered the same issue a while back and to resolve it I believe Lisa copied that test along with several other regression tester programs from Michigan to AMD. I can provide you those regression tester programs, but at a higher level, I think this is a good time to ask the question on how we want to provide external users all the files necessary to run the regression tester? As Nilay points out, the encumbered repo has some, but not all of the necessary files. I believe, one also needs another set of regression tester programs which include both the anagram files, as well as the SPECCPU files for the long regression tester runs. Thoughts? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Monday, January 17, 2011 1:55 PM To: M5 Developer List Subject: Re: [m5-dev] EIO Regression Tests I figured that out, but there is no anagram directory in tests/test- progs. I, therefore, receive the following error: gzip: tests/test-progs/anagram/bin/alpha/eio/anagram-vshort.eio.gz: No such file or directory -- Nilay On Mon, 17 Jan 2011, Steve Reinhardt wrote: The one where the EIO code lives. That's it's name, at http://repo.m5sim.org. 
On Mon, Jan 17, 2011 at 12:59 PM, Nilay Vaish ni...@cs.wisc.edu wrote: What do you mean by the encumbered repository? On Mon, 17 Jan 2011, Steve Reinhardt wrote: Yes, it should be a concern... it should work. Did you do a pull on the encumbered repository? There were some changes there needed to maintain compatibility with the latest m5 dev repo. Otherwise you'll need to provide more detail about how things failed. Steve On Mon, Jan 17, 2011 at 10:21 AM, Nilay Vaish ni...@cs.wisc.edu wrote: I just ran the regression tests for the patch (deals with SLICC and cache coherence protocols) that I need to commit. The EIO tests fail. Should this be a concern? -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] MOESI_CMP_token
Hi Nilay, There is often a tradeoff between doing operations in actions versus the input port. Overall, I agree with you that we should concentrate on doing most/all operations in actions, not the input ports. The input port logic is often a confusing nested conditional mess and performing other operations inside the input port logic only further confuses things. I believe the reason why the ExternalResponse is monitored at the input port for the token protocol is because this is a critical piece of information needed for tuning the dynamic timeout latency. It is likely that Mike Marty (who I believe is the original author) just wanted to make sure he always correctly identified external responses. My suggestion for you is not to worry about it and just keep the logic as is. There is no need to give yourself extra work. To my knowledge GEM5 has yet to be configured into multiple chips and most of the ExternalResponse logic deals with separating local cache hits vs. remote cache hits. Once we configure multiple chip systems, we can revisit the ExternalResponse logic and possibly optimize it. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Sent: Friday, January 14, 2011 1:12 AM To: m5-dev@m5sim.org Subject: [m5-dev] MOESI_CMP_token I am trying to update the MOESI CMP token protocol. Line 563 in the file for the L1 cache controller caught my eye. While processing a message received through the response network, the transaction buffer entry for the address is edited. tbe.ExternalResponse := true; Should this happen where it is happening currently? I think this change should appear in some action. -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
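[Editor's note: the dynamic-timeout tuning mentioned above can be pictured with a small sketch. This is illustrative Python, not gem5 code, and the class and parameter names are invented: the idea is that the L1 keeps a running average of the latency of externally satisfied requests, and a request outstanding longer than a multiple of that average triggers a persistent request.]

```python
class TimeoutEstimator:
    """Toy model of a dynamic timeout derived from external-response latency.

    Assumptions (not taken from the gem5 sources): an exponentially
    weighted moving average with weight 0.5 and a timeout of 4x the
    average observed latency.
    """

    def __init__(self, weight=0.5, multiplier=4):
        self.avg_latency = 100      # initial guess, in cycles
        self.weight = weight
        self.multiplier = multiplier

    def record_external_response(self, issue_time, completion_time):
        # Only responses marked as external update the estimate; this is
        # why correctly identifying ExternalResponse matters.
        sample = completion_time - issue_time
        self.avg_latency = (self.weight * sample +
                            (1 - self.weight) * self.avg_latency)

    def timeout(self):
        # Escalate (e.g., to a persistent request) once a request has been
        # outstanding longer than this.
        return self.multiplier * self.avg_latency
```

A mis-identified local response would pull the average down and cause spurious timeouts, which is presumably why the original author checked for external responses so eagerly.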
Re: [m5-dev] Review Request: Updating MOESI CMP Directory protocol as per the new interface
Hi Nilay, Yes, please add the OOD token. I believe that will come in handy when developing new protocols. Don’t worry about separating out that RequestorMachine change. It seems like just a few extra lines. Also I believe the MOESI_CMP_Directory protocol did work correctly before your change, right? If so, the RequestorMachine lines are related to the rest of the patch. Brad From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Thursday, January 13, 2011 8:57 AM To: Nilay Vaish; Default; Beckmann, Brad Subject: Re: Review Request: Updating MOESI CMP Directory protocol as per the new interface This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/359/ On January 13th, 2011, 8:48 a.m., Brad Beckmann wrote: src/mem/protocol/MOESI_CMP_directory-L1cache.sm http://reviews.m5sim.org/r/359/diff/8/?file=9537#file9537line159 (Diff revision 8) 155 if (L1DcacheMemory.isTagPresent(addr)) { 157 return L1Icache_entry; So the assumption here is the L1IcacheMemory.lookup() call either returns the L1I cache entry or NULL/OOD, correct? Does SLICC also support explicitly passing back OOD? Currently, SLICC does not have support for an Out Of Domain (OOD) token. But I can add that as I had done earlier. I am not sure if we actually need it. On January 13th, 2011, 8:48 a.m., Brad Beckmann wrote: src/mem/protocol/MOESI_CMP_directory-L1cache.sm http://reviews.m5sim.org/r/359/diff/8/?file=9537#file9537line465 (Diff revision 8) 430 out_msg.RequestorMachine := MachineType:L1Cache; This seems like an unrelated change, correct? However it is pretty minor, so don't worry about it. IIRC, this is necessary or else a certain panic state is reached. I think I should separately make this change. - Nilay On January 12th, 2011, 10:44 p.m., Nilay Vaish wrote: Review request for Default. By Nilay Vaish.
Updated 2011-01-12 22:44:50 Description This is a request for reviewing the proposed changes to the MOESI CMP directory cache coherence protocol to make it conform with the new cache memory interface and changes to SLICC. Testing These changes have been tested using the Ruby random tester. The tester was used with -l = 1048576 and -n = 2. Diffs * src/mem/protocol/MOESI_CMP_directory-L1cache.sm (c6bc8fe81e79) * src/mem/protocol/MOESI_CMP_directory-L2cache.sm (c6bc8fe81e79) * src/mem/protocol/MOESI_CMP_directory-dir.sm (c6bc8fe81e79) * src/mem/protocol/MOESI_CMP_directory-dma.sm (c6bc8fe81e79) View Diff: http://reviews.m5sim.org/r/359/diff/ ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Checkpoint Tester Problems
Well I just realized that I don't have permissions to add new bug reports to Flyspray. My Flyspray user id is beckmabd if anyone would like to grant me permissions. Thanks! The checkpoint tester is a script located in util/checkpoint-tester.py that Ali recently pointed me to. The script is commented well and fully describes what it does and how to run it. When I run a small test using X86_FS, the script identifies the following mismatches: Cmd: util/checkpoint-tester.py -i 2000 -- build/ALPHA_FS_MOESI_hammer/m5.debug configs/example/fs.py --script test/halt.sh Diff output: --- checkpoint-test/m5out/cpt.1/m5.cpt Wed Jan 12 14:59:28 2011 +++ checkpoint-test/test.4/cpt.1/m5.cpt Wed Jan 12 15:00:42 2011 @@ -10,20 +10,20 @@ so_state=2 locked=false _status=1 -instCnt=10 +instCnt=9 [system.cpu.xc.0] _status=0 -funcExeInst=16 +funcExeInst=15 quiesceEndTick=0 iplLast=0 iplLastTick=0 floatRegs.i=0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 -intRegs=549755813888 0 2097152 0 0 0 590336 0 0 0 0 0 0 0 0 0 0 2097208 380 0 0 0 0 2097189 0 0 0 0 0 0 0 0 133 0 0 0 0 0 0 +intRegs=549755813888 0 2097152 0 0 0 590336 0 0 0 0 0 0 0 0 0 18446743523955834880 2097182 380 0 0 0 0 2097189 0 0 0 0 0 0 0 0 133 0 0 0 0 0 0 _pc=2097202 -_npc=2097208 -_upc=1 -_nupc=2 +_npc=2097210 +_upc=0 +_nupc=1 regVal=3758096401 0 0 458752 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 4294905840 1024 2 243392 0 1288 0 0 0 260 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1974748653749254 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1280 0 0 0 0 0 0 0 0 0 0 0 0 0 0 132609 0 0 0 0 67108864 0 0 0 0 0 16 8 16 16 16 16 0 0 0 0 0 24 0 0 0 0 0 0 0 0 0 483328 0 0 0 0 0 0 0 0 0 0 0 0 483328 0 0 0 0 983295 983295 983295 983295 983295 983295 65535 65535 23 65535 65535 983295 65535 45768 43728 45768 45768 45768 45768 45952 0 45952 45952 45952 43976 45952 0 0 0 0 0 0 0 0 0 0 0 4 276095232 0 [system.cpu.tickEvent] By the way, could we add
this test to the regression tester? Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Gabe Black Sent: Wednesday, January 12, 2011 4:42 PM To: M5 Developer List Subject: Re: [m5-dev] Checkpoint Tester Problems Flyspray would be good. We don't use it like we should, but it's probably the most appropriate place. I'm not familiar with the checkpoint tester. How does it work (link to the wiki would be fine), and what were the differences? Gabe Beckmann, Brad wrote: Hi All, While using the checkpoint tester script, I noticed that at least X86_FS with the atomic + classic memory system encounters differences in the checkpoint state. The good news is that none of the patches I have out for review add any more checkpoint differences, but we still should track down the existing bugs at some point. Should I use flyspray to document the bugs, or would you prefer me to document these bugs some other way? Thanks, Brad -- ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
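[Editor's note: m5.cpt checkpoints are INI-style text, so the kind of mismatch report shown above can be reproduced with a few lines of Python. This is an illustrative sketch, not the actual checkpoint-tester implementation, which diffs the checkpoint files directly.]

```python
import configparser

def diff_checkpoints(text_a, text_b):
    """Compare two m5.cpt checkpoint bodies (INI-style text) and return
    {(section, key): (value_a, value_b)} for every mismatched value.
    Note: configparser lowercases key names by default."""
    a = configparser.ConfigParser()
    b = configparser.ConfigParser()
    a.read_string(text_a)
    b.read_string(text_b)
    mismatches = {}
    for sec in a.sections():
        for key, val in a.items(sec):
            other = b.get(sec, key, fallback=None)
            if other != val:
                mismatches[(sec, key)] = (val, other)
    return mismatches
```

Running this over the two cpt.1 files above would flag instCnt, funcExeInst, intRegs, and the PC-related keys, matching the unified diff Brad pasted.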
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Are you sure you would call the above piece of code as __implicit__ setting of cache and tbe entry variables? In this case, the local variable has been __explicitly__ passed in the call to the trigger function. To me 'X is implicit' means that the programmer does not need to write 'X' in the protocol file for the compiler. For example, currently the trigger function implicitly passes the state of the address as a parameter. Such code is possible; my only concern is that once the variable is set, it cannot be used again on the left hand side of the assignment operator. Entry local_var := getL1ICacheEntry(in_msg.LineAddress); /* Do something */ local_var := getL1DCacheEntry(in_msg.LineAddress); This SLICC code will not generate correct C++ code, since we assume that a pointer variable can only be used in its dereferenced form, except when passed on as a parameter in a function call. Yeah, I think we were confusing each other before because implicit was meaning different things. When I said implicitly passes the cache entry, I meant that relative to the actions, not the trigger function. As you mentioned, the input state is an implicit parameter to the trigger function, but the address is an explicit parameter to the trigger function and an implicit parameter to the actions. You were thinking the former and we were thinking the latter. Now I think we are on the same page. Actually I was thinking that we only dereference the cache_entry pointer when we reference a member of the class. I haven't thought this all the way through, but is that possible? Then such an assignment would work. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
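[Editor's note: the rule Brad proposes at the end, dereferencing the entry pointer only at member accesses so that reassignment of the raw pointer stays legal, is easy to see in a toy code generator. This is illustrative Python, not SLICC's real generator (which lives in StateMachine.py and differs in detail); the function name is invented.]

```python
def gen_expr(var, member=None, is_pointer=True):
    """Emit the C++ expression for a SLICC entry variable.

    Reassignment or parameter passing uses the raw pointer name, while a
    member access dereferences it, so 'local_var := getL1DCacheEntry(...)'
    can follow an earlier assignment without generating bad C++.
    """
    if member is None:
        return var                              # raw pointer: lhs or argument
    if is_pointer:
        return "(*{}).{}".format(var, member)   # dereference on member access
    return "{}.{}".format(var, member)
```

Under this rule the problematic pair of assignments in Nilay's example compiles to two plain pointer assignments, and only expressions like cache_entry.DataBlk become (*cache_entry).DataBlk.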
[m5-dev] Checkpoint Tester Problems
Hi All, While using the checkpoint tester script, I noticed that at least X86_FS with the atomic + classic memory system encounters differences in the checkpoint state. The good news is that none of the patches I have out for review add any more checkpoint differences, but we still should track down the existing bugs at some point. Should I use flyspray to document the bugs, or would you prefer me to document these bugs some other way? Thanks, Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Hi Nilay, Sure, using a local variable to further reduce the calls to getCacheEntry is a great idea. I think that is orthogonal to the suggestion I was making. I just want the ability to directly set the cache_entry and tbe_entry variables in the trigger function. That way the address, cache_entry, and tbe_entry variables are dealt with consistently and it avoids adding the separate calls to set_cache_entry() and set_tbe() in the inports. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Friday, January 07, 2011 11:40 AM To: Beckmann, Brad Cc: Default Subject: RE: Review Request: Changing how CacheMemory interfaces with SLICC Brad, my comments are inline. On Fri, 7 Jan 2011, Beckmann, Brad wrote: Hi Nilay, Unfortunately I can't provide you an example of a protocol where getCacheEntry behaves in a different manner, but they do exist. I reviewed your most recent patch updates and I don't think what we're asking for is much different than what you have on reviewboard right now. Basically, all we need to do is add back in the capability for the programmer to write their own getCacheEntry function in the .sm file. I know that I initially asked you to automatically generate those functions, and I still think that is useful for most protocols, but Lisa made me realize that we need customized getCacheEntry functions as well. Also we may want to change the name of the generated getCacheEntry function to getExclusiveCacheEntry so that one realizes the exclusive assumption made by the function. Other than that, the only other change I suggest is to allow the trigger function to directly set the implicit cache_entry and tbe_entry variables. Below is an example of what I'm envisioning: [Nilay] If we do things in this way, then any in_port, in which cache / tb entries are accessed before the trigger function, would still make calls to isCacheTagPresent().
Currently in MOESI_CMP_directory-L1cache.sm: in_port(useTimerTable_in, Address, useTimerTable) { if (useTimerTable_in.isReady()) { set_cache_entry(getCacheEntry(useTimerTable.readyAddress())); set_tbe(TBEs[useTimerTable.readyAddress()]); trigger(Event:Use_Timeout, useTimerTable.readyAddress()); } } Replace that with the following: in_port(useTimerTable_in, Address, useTimerTable) { if (useTimerTable_in.isReady()) { trigger(Event:Use_Timeout, useTimerTable.readyAddress(), getExclusiveCacheEntry(useTimerTable.readyAddress()), TBEs[useTimerTable.readyAddress()]); } } [Nilay] Instead of passing cache and tb entries as arguments, we can create local variables in the trigger function using the address argument. Please let me know if you have any questions. Thanks...you're almost done. :) Brad Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
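[Editor's note: the "exclusive assumption" behind the proposed getExclusiveCacheEntry name boils down to asserting that a block lives in at most one of the controller's caches. A minimal Python model of that invariant (names invented, caches modeled as dicts from address to entry):]

```python
def get_exclusive_entry(addr, caches):
    """Return the entry for addr, asserting exclusion: the block may be
    present in at most one of the given caches. Returns None (the SLICC
    OOD/NULL analogue) when the block is cached nowhere."""
    hits = [c for c in caches if addr in c]
    assert len(hits) <= 1, "exclusion violated for %s" % addr
    return hits[0][addr] if hits else None
```

A protocol without the exclusion property (e.g. an inclusive L1/L2) could not use this generated lookup, which is exactly why the email argues for also allowing hand-written getCacheEntry functions in the .sm file.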
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Sure, using a local variable to further reduce the calls to getCacheEntry is a great idea. I think that is orthogonal to the suggestion I was making. I just want the ability to directly set the cache_entry and tbe_entry variables in the trigger function. That way the address, cache_entry, and tbe_entry variables are dealt with consistently and it avoids adding the separate calls to set_cache_entry() and set_tbe() in the inports. Firstly, we have to set cache and transaction buffer entry variables whenever we do allocation or deallocation of entries. This means these calls cannot be completely avoided. Secondly, while processing events from the mandatory queue (as it is called in the current implementations), if these variables are not set, we will have to revert to the earlier approach. This would double the number of times cache entry lookups are performed, as the trigger function will perform the lookup again. This would also mean that both the approaches for looking up a cache entry will have to exist simultaneously. Absolutely, we still need the ability to allocate or deallocate entries within actions. I'm not advocating to completely eliminate the set/unset cache and tbe entry functions. I just want to avoid including those calls in the inports. I'm confused why the mandatory queue is different than other queues. They all trigger events in the same way. Maybe I should point out that I'm assuming that getCacheEntry can return a NULL pointer and thus that can be passed into the trigger call when no cache or tbe entry exists. Another concern is in the implementation of getCacheEntry(). If this function has to return a pointer to a cache entry, we would have to provide support for local variables which internally SLICC would assume to be pointer variables. Within SLICC understanding that certain variables are actually pointers is a little bit of a nuisance, but there already exist examples where we make that distinction.
For instance, look at the if para.pointer conditionals in StateMachine.py. We just have to treat cache and tbe entries in the same fashion. In my opinion, we should maintain one method for looking up cache entries. My own experience informs me that it is not difficult to incorporate calls to set/unset_cache_entry() in already existing protocol implementations. For implementing new protocols, I think the greater obstacle will be in implementing the protocol correctly and not in using entry variables correctly. If we document this change lucidly, there is no reason to believe a SLICC programmer will be exceptionally pushed because of this change. Assuming that this change does introduce some complexity in programming with SLICC, does that complexity outweigh the performance improvements? My position is we can leverage SLICC as an intermediate language and achieve the performance benefits of your change without significantly impacting the programmability. I agree that we need the set/unset_cache_entry calls in the allocate and deallocate actions. I see no problem with that. I just want to treat these new implicit cache and tbe entry variables like the existing implicit variable address. Therefore I want to pass them into the trigger operation like the address variable. I also want just one method for looking up cache entries. I believe the only difference is that I would like to set the cache and tbe entries in the trigger function, as well as allowing them to be set in the actions. I hope that clarifies at least what I'm envisioning. I appreciate your feedback on this and I want to reiterate that I think your change is really close to being done. If you still feel like I'm missing something, I would be happy to chat with you over the phone. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
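[Editor's note: the calling convention being debated, binding the implicit per-transition variables in the trigger call rather than via set_cache_entry()/set_tbe() calls in the in_port, can be modeled in a few lines. This is illustrative Python, not generated SLICC output; the class and method names are invented.]

```python
class ToyController:
    """Sketch of a state-machine controller where trigger() binds the
    implicit variables (address, cache_entry, tbe) seen by every action
    fired for that transition."""

    def __init__(self):
        self.handlers = {}          # event -> list of action callables

    def register(self, event, action):
        self.handlers.setdefault(event, []).append(action)

    def trigger(self, event, address, cache_entry=None, tbe=None):
        # One lookup in the in_port, passed through here; actions never
        # re-look-up the entry, and None models a missing entry (OOD).
        for action in self.handlers.get(event, []):
            action(address, cache_entry, tbe)
```

In this model an in_port does a single getCacheEntry() per message and hands the result to trigger(), which matches Brad's goal of one lookup and a uniform treatment of address, cache_entry, and tbe_entry.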
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Hi Nilay, Overall, I believe we are in more agreement with each other than maybe you think. I'm glad you included pseudo code in your latest email. That is a great idea. I think part of our problem is we are comprehending our textual descriptions in different ways. Below are my responses: Absolutely, we still need the ability to allocate or deallocate entries within actions. I'm not advocating to completely eliminate the set/unset cache and tbe entry functions. I just want to avoid including those calls in the inports. I'm confused why the mandatory queue is different than other queues. They all trigger events in the same way. if (L1IcacheMemory.isTagPresent(in_msg.LineAddress)) { // The tag matches for the L1, so the L1 asks the L2 for it. trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress); } Brad, the mandatory queue is just an example where an inport may perform a tag lookup before the cache and transaction buffer entries have been set. Above is an excerpt from the file MOESI_CMP_directory-L1cache.sm. Before trigger() is called, isTagPresent() is called. This means a tag lookup is being performed before the cache or transaction buffer entries have been set. Suppose the tag was present in the L1Icache; then in the trigger() call, we will again perform a lookup. Similarly, there is an inport in the Hammer protocol implementation where getCacheEntry() is called before a call to trigger(). Now, why should we use getCacheEntry() in the inport and cache entry in the action? The reason is, as you pointed out, we ideally want to call getCacheEntry once. I believe your suggestion to use local variables in the input ports gets us there.
Below is what I'm envisioning for the MOESI_hammer mandatory queue in_port logic (at least the IFETCH half of the logic): ENTRY getL1ICacheEntry(Address addr) { assert(is_valid(L1DcacheMemory.lookup(addr)) == FALSE); assert(is_valid(L2cacheMemory.lookup(addr)) == FALSE); return L1IcacheMemory.lookup(addr); } ENTRY getL1DCacheEntry(Address addr) { assert(is_valid(L1IcacheMemory.lookup(addr)) == FALSE); assert(is_valid(L2cacheMemory.lookup(addr)) == FALSE); return L1DcacheMemory.lookup(addr); } ENTRY getL2CacheEntry(Address addr) { assert(is_valid(L1IcacheMemory.lookup(addr)) == FALSE); assert(is_valid(L1DcacheMemory.lookup(addr)) == FALSE); return L2cacheMemory.lookup(addr); } in_port(mandatoryQueue_in, CacheMsg, mandatoryQueue, desc=..., rank=0) { if (mandatoryQueue_in.isReady()) { peek(mandatoryQueue_in, CacheMsg, block_on=LineAddress) { // Set the local entry variables ENTRY L1I_cache_entry = getL1ICacheEntry(in_msg.LineAddress); ENTRY L1D_cache_entry = getL1DCacheEntry(in_msg.LineAddress); TBE_Entry tbe_entry = getTBE(in_msg.LineAddress); // Check for data access to blocks in I-cache and ifetchs to blocks in D-cache if (in_msg.Type == CacheRequestType:IFETCH) { // ** INSTRUCTION ACCESS *** // Check to see if it is in the OTHER L1 if (is_valid(L1D_cache_entry)) { // The block is in the wrong L1, try to write it to the L2 if (L2cacheMemory.cacheAvail(in_msg.LineAddress)) { trigger(Event:L1_to_L2, in_msg.LineAddress, L1D_cache_entry, tbe_entry); } else { replace_addr = L2cacheMemory.cacheProbe(in_msg.LineAddress); replace_cache_entry = getL2CacheEntry(replace_addr); replace_tbe_entry = getTBE(replace_addr); trigger(Event:L2_Replacement, replace_addr, replace_cache_entry, replace_tbe_entry); } } if (is_valid(L1I_cache_entry)) { // The tag matches for the L1, so the L1 fetches the line.
We know it can't be in the L2 due to exclusion trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress, L1I_cache_entry, tbe_entry); } else { if (L1IcacheMemory.cacheAvail(in_msg.LineAddress)) { // L1 doesn't have the line, but we have space for it in the L1 ENTRY L2_cache_entry = getL2CacheEntry(in_msg.LineAddress); if (is_valid(L2_cache_entry)) { // L2 has it (maybe not with the right permissions) trigger(Event:Trigger_L2_to_L1I, in_msg.LineAddress, L2_cache_entry, tbe_entry); } else { // We have room, the L2 doesn't have it, so the L1 fetches the line trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress, L1I_cache_entry, tbe_entry); // you could also say here: trigger(mandatory_request_type_to_event(in_msg.Type), in_msg.LineAddress, OOD, OOD); } } else { // No room in the L1, so we need to make room if (L2cacheMemory.cacheAvail(L1IcacheMemory.cacheProbe(in_msg.LineAddress))) {
Re: [m5-dev] Review Request: Ruby: Update the Ruby request type names for LL/SC
Oops...I forgot to include the -o option when updating it. I just uploaded a new patch...try it again. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Sent: Monday, January 10, 2011 9:01 AM To: M5 Developer List Subject: Re: [m5-dev] Review Request: Ruby: Update the Ruby request type names for LL/SC Brad, this patch also did not apply cleanly. I think the patches that you are trying to upload do not follow git's style. On Mon, January 10, 2011 10:52 am, Brad Beckmann wrote: --- This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/391/ --- (Updated 2011-01-10 08:52:22.568922) Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. Summary --- Ruby: Update the Ruby request type names for LL/SC Diffs (updated) - src/mem/ruby/libruby.hh b5d03e87db4e src/mem/ruby/libruby.cc b5d03e87db4e src/mem/ruby/recorder/TraceRecord.cc b5d03e87db4e src/mem/ruby/system/DMASequencer.cc b5d03e87db4e src/mem/ruby/system/RubyPort.cc b5d03e87db4e src/mem/ruby/system/Sequencer.cc b5d03e87db4e Diff: http://reviews.m5sim.org/r/391/diff Testing --- Thanks, Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Hi Nilay, Unfortunately I can't provide you an example of a protocol where getCacheEntry behaves in a different manner, but they do exist. I reviewed your most recent patch updates and I don't think what we're asking for is much different than what you have on reviewboard right now. Basically, all we need to do is add back in the capability for the programmer to write their own getCacheEntry function in the .sm file. I know that I initially asked you to automatically generate those functions, and I still think that is useful for most protocols, but Lisa made me realize that we need customized getCacheEntry functions as well. Also we may want to change the name of the generated getCacheEntry function to getExclusiveCacheEntry so that one realizes the exclusive assumption made by the function. Other than that, the only other change I suggest is to allow the trigger function to directly set the implicit cache_entry and tbe_entry variables. Below is an example of what I'm envisioning: Currently in MOESI_CMP_directory-L1cache.sm: in_port(useTimerTable_in, Address, useTimerTable) { if (useTimerTable_in.isReady()) { set_cache_entry(getCacheEntry(useTimerTable.readyAddress())); set_tbe(TBEs[useTimerTable.readyAddress()]); trigger(Event:Use_Timeout, useTimerTable.readyAddress()); } } Replace that with the following: in_port(useTimerTable_in, Address, useTimerTable) { if (useTimerTable_in.isReady()) { trigger(Event:Use_Timeout, useTimerTable.readyAddress(), getExclusiveCacheEntry(useTimerTable.readyAddress()), TBEs[useTimerTable.readyAddress()]); } } Please let me know if you have any questions. Thanks...you're almost done. :) Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Thursday, January 06, 2011 6:32 AM To: Beckmann, Brad Cc: Default Subject: RE: Review Request: Changing how CacheMemory interfaces with SLICC Can you give me an example of a protocol where getCacheEntry() behaves in a different manner?
-- Nilay On Wed, 5 Jan 2011, Beckmann, Brad wrote: Hi Nilay, Lisa Hsu (another member of the lab here at AMD) and I were discussing these changes a bit more and there was one particular idea that came out of our conversation that I wanted to relay to you. Basically, we were thinking about how these changes will impact the flexibility of SLICC and we concluded that it is important to allow one to craft custom getCacheEntry functions for each protocol. I know initially I was hoping to generate these functions, but I now don’t think that is possible without restricting what protocols can be supported by SLICC. Instead we can use these customized getCacheEntry functions to pass the cache entry to the actions via the trigger function. For those controllers that manage multiple cache memories, it is up to the programmer to understand what the cache entry pointer points to. That should eliminate the need to have multiple *cacheMemory_entry variables in the .sm files. Instead there is just the cache_entry variable that is set either by the trigger function call or set_cache_entry. Does that make sense to you? Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh
1. Below is a snip of a protocol trace that I recently used. I think it is important for us to maintain that there is no DPRINTF information prepended to each line. The initial motivation for the protocol trace was that tracing protocol transitions using standard debug print was too verbose. These traces can be 100s of MB, if not GBs, in size, so reducing the information printed on each line is important. Nilay, could you send a snip of the trace with the patch applied? 2233850 3 L1Cache Load I>IS [0x409ec0, line 0x409ec0] 2233850 3 L1Cache L2_Replacement M>MI [0x40cfc0, line 0x40cfc0] 2233866 3 L1Cache Writeback_Ack MI>I [0x10bd40, line 0x10bd40] 2233866 3 L1Cache Writeback_Ack MI>I [0x40cfc0, line 0x40cfc0] 2234458 3 Seq Done [0x4033c3, line 0x4033c0] 3380 cycles 2234458 3 L1Cache Exclusive_Data IM>MM_W [0x4033c0, line 0x4033c0] 0 Directory-0 2234458 3 Seq Begin [0x40f883, line 0x40f880] ST 2234459 3 L1Cache All_acks_no_sharers MM_W>MM [0x4033c0, line 0x4033c0] 2234508 3 L1Cache Exclusive_Data IS>M_W [0x409ec0, line 0x409ec0] 0 Directory-0 2234509 3 L1Cache All_acks_no_sharers M_W>M [0x409ec0, line 0x409ec0] 2234510 3 L1Cache L1_to_L2 [0x4033c0, line 0x4033c0] 2234510 3 L1Cache Load I>IS [0x407ec0, line 0x407ec0] 2234510 3 L1Cache L2_Replacement M>MI [0x40b4c0, line 0x40b4c0] 2234510 3 L1Cache L1_to_L2 M>M [0x409ec0, line 0x409ec0] 2234510 3 L1Cache Load I>IS [0x100c40, line 0x100c40] 2234510 3 L1Cache L2_Replacement MM>MI [0x4033c0, line 0x4033c0] 2. Just for my own knowledge… Nate, you mentioned that handling the SIGABRT signal is the right way to make this feature work for all of M5. Why is that? Is it just the preference not to use macros that overwrite the meaning of assert, or is it something more fundamental?
Thanks, Brad From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Tuesday, January 04, 2011 7:24 PM To: Steve Reinhardt; Ali Saidi; Gabe Black; Nathan Binkert Cc: Nilay Vaish; Default; Beckmann, Brad Subject: Re: Review Request: ruby: get rid of ruby's Debug.hh This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/367/ On January 4th, 2011, 4:31 p.m., Brad Beckmann wrote: Hi Nate, I have a couple questions: 1. Have you looked at the protocol trace output after your change? Does it look exactly like it did before? It seems that the output should be the same based on my brief inspection of your patch, but I would like to be sure about that. It may not be obvious, but there is a specific rationale behind the format of the protocol trace, and I want to make sure that stays the same. 2. With your patch applied, what happens if one hits an assert when running interactively? Previously, the process would abort, allowing one to attach gdb and examine what is going on. I liked that feature and it would be great if we could maintain it. Could we port that feature to all of M5? On January 4th, 2011, 6:05 p.m., Nathan Binkert wrote: 1) I have not, because I don't know how, but I tried hard to make it exactly the same. Can you help me out? It won't look identical because DPRINTF prepends some stuff (curTick and the object name). 2) We don't have a mechanism to have the process stall until GDB is attached, but given that this worked in Ruby only, I'd agree that this should be something that we do globally in M5. The right way to do this would be to handle SIGABRT and stall in the abort handler (I think that should work). Can we work on this patch and do that as a separate one? Brad, do you have some protocol trace with you? I have seen the trace that gets generated with the current trace facility using the Ruby trace flag. It prints all the events for all the cache controllers and network routers. If you prefer, I can send you an example trace. 
Or you can generate one by running m5.opt with trace file and trace flag options supplied. ./build/ALPHA_SE_MESI_CMP_directory/m5.opt --trace-file=MESI.trace --trace-flags=Ruby ./configs/example/ruby_random_test.py -l 1000 - Nilay On January 4th, 2011, 3:02 p.m., Nathan Binkert wrote: Review request for Default, Ali Saidi, Gabe Black, Steve Reinhardt, and Nathan Binkert. By Nathan Binkert. Updated 2011-01-04 15:02:38
Description
ruby: get rid of ruby's Debug.hh
Get rid of the Debug class
Get rid of ASSERT and use assert
Use DPRINTF for ProtocolTrace
Testing
This compiles and passes all of the quick regressions, but it would be nice for a Ruby developer to take a look and see if I got rid of any useful functionality.
Diffs
* configs/ruby/Ruby.py (7338bc628489)
* src/mem/SConscript
Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh
Is it possible to fix the width of the information prepended by DPRINTF? It would be great if we could maintain the current fixed-width format. Brad -Original Message- From: bink...@gmail.com [mailto:bink...@gmail.com] On Behalf Of nathan binkert Sent: Wednesday, January 05, 2011 10:36 AM To: M5 Developer List Cc: Beckmann, Brad Subject: Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh Looks like we should just remove the first, second, and third columns that are spit out since they're covered almost exactly by the implicit columns added by DPRINTF. Right? Nate This is how the protocol trace would look. I actually did not know such a thing exists. I was relying on DPRINTF statements for checking the events that occurred. This is certainly easier to read and much more compact.

1395: system.l1_cntrl3: 1395 3 L1Cache L1_Replacement IM>IM [0x15c0, line 0x15c0]
1395: system.l1_cntrl2: 1395 2 L1Cache L1_Replacement IM>IM [0x3ac0, line 0x3ac0]
1395: system.l1_cntrl2: 1395 2 L1Cache L1_Replacement IM>IM [0x3ac0, line 0x3ac0]
1395: system.l1_cntrl2: 1395 2 L1Cache L1_Replacement IM>IM [0x3ac0, line 0x3ac0]
1396: system.l1_cntrl1: 1396 1 L1Cache L1_Replacement IM>IM [0x2ac0, line 0x2ac0]
1397: system.ruby.cpu_ruby_ports2: 1397 2 Seq Done [0x3ae8, line 0x3ac0] 1386 cycles
1397: system.l1_cntrl2: 1397 2 L1Cache Data_all_Acks IM>M [0x3ac0, line 0x3ac0]
1397: system.l1_cntrl3: 1397 3 L1Cache L1_Replacement IM>IM [0x15c0, line 0x15c0]
1397: system.l2_cntrl0: 1397 0 L2Cache L2_Replacement_clean MT_MB>MT_MB [0x3ac0, line 0x3ac0]
1400: system.l2_cntrl0: 1400 0 L2Cache L1_GETX MT_MB>MT_MB [0x400, line 0x400]
1401: system.l2_cntrl0: 1401 0 L2Cache L2_Replacement_clean MT_MB>MT_MB [0x3ac0, line 0x3ac0]
1402: system.l1_cntrl0: 1402 0 L1Cache L1_Replacement IM>IM [0x4dc0, line 0x4dc0]
1402: system.l1_cntrl2: 1402 2 L1Cache L1_Replacement IM>IM [0xdc0, line 0xdc0]
1402: system.l1_cntrl2: 1402 2 L1Cache L1_Replacement IM>IM [0xdc0, line 0xdc0]

On Wed, January 5, 2011 10:26 am, Beckmann, 
Brad wrote: [snip: protocol trace snippet and SIGABRT question, quoted from Brad's message earlier in this thread]
Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh
So if we explicitly handled the SIGABRT signal, we would only want to do that if we are running interactively, correct? If so, then we would still have some sort of conditional similar to, if not the same as, the current conditional in the assert macro: if (isatty(STDIN_FILENO)). If my understanding is correct, then we would still have multiple behaviors for assert: one when running interactively and another when running in batch mode. Am I missing something? I just want to make sure I understand why we don't want to just move the current Ruby ASSERT macro into src/base/debug.hh (or some other file in src/base). Thanks, Brad From: bink...@gmail.com [mailto:bink...@gmail.com] On Behalf Of nathan binkert Sent: Wednesday, January 05, 2011 10:40 AM To: Beckmann, Brad Cc: Nilay Vaish; Steve Reinhardt; Ali Saidi; Gabe Black; Default Subject: Re: Review Request: ruby: get rid of ruby's Debug.hh 2. Just for my own knowledge... Nate, you mentioned that handling the SIGABRT signal is the right way to make this feature work for all of M5. Why is that? Is it just the preference not to use macros that overwrite the meaning of assert, or is it something more fundamental? Not fundamental. Mostly because we don't want multiple meanings of assert. It seems that if we could get this to work properly that it would be easiest as well. Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
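The behavior being discussed, stalling on abort only when running interactively, can be sketched as a global SIGABRT handler that performs the isatty check itself. This is a sketch of the idea, not gem5 code; all names here are assumptions.

```cpp
// Sketch: install a SIGABRT handler so an interactive run stalls until
// a debugger attaches, while a batch run aborts normally.
#include <csignal>
#include <cstdio>
#include <unistd.h>

extern "C" void abortHandler(int)
{
    if (isatty(STDIN_FILENO)) {
        fprintf(stderr, "abort: attach gdb to pid %d, then set done = 1\n",
                (int)getpid());
        // Spin until a debugger attaches and flips 'done'.
        volatile int done = 0;
        while (!done)
            sleep(1);
    }
    // Restore the default action and re-raise so batch runs still
    // terminate (and can dump core) as before.
    signal(SIGABRT, SIG_DFL);
    raise(SIGABRT);
}

void installAbortHandler()
{
    signal(SIGABRT, abortHandler);
}
```

This keeps a single meaning for assert (it still calls abort); the interactive-versus-batch distinction lives in one process-wide handler instead of a redefined ASSERT macro.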
Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh
Yeah, that seems rather tedious. Let's just use DPRINTFN and maintain the current format. As long as the protocol trace format looks the same as before, I'm happy with the change. Brad -Original Message- From: bink...@gmail.com [mailto:bink...@gmail.com] On Behalf Of nathan binkert Sent: Wednesday, January 05, 2011 12:30 PM To: Beckmann, Brad Cc: M5 Developer List Subject: Re: [m5-dev] Review Request: ruby: get rid of ruby's Debug.hh Is it possible to fix the width of the information prepended by DPRINTF? It would be great if we could maintain the current fixed-width format. That might be hard (and may argue for DPRINTFN). In practice, when I want that, I usually just ensure that my object names end up not varying in length, e.g. system0.cpu0.l1_foo0. If I have more than 10 things, I make the name cpu00 or something like that. Nate ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
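Nate's naming workaround can be sketched as a small helper that zero-pads the index so every generated object name has the same length; the helper name is an assumption for illustration.

```cpp
// Sketch: zero-pad object indices (cpu00, cpu01, ..., cpu11) so the
// name column DPRINTF prepends stays a constant width.
#include <cstdio>
#include <string>

std::string paddedName(const char *base, int index, int width)
{
    char buf[64];
    // %0*d takes the field width as an argument and pads with zeros.
    snprintf(buf, sizeof(buf), "%s%0*d", base, width, index);
    return std::string(buf);
}
```

With width 2, ten or more objects still line up: "cpu07" and "cpu12" are the same length, so the implicit DPRINTF columns stay aligned.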
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Hi Nilay, Lisa Hsu (another member of the lab here at AMD) and I were discussing these changes a bit more and there was one particular idea that came out of our conversation that I wanted to relay to you. Basically, we were thinking about how these changes will impact the flexibility of SLICC and we concluded that it is important to allow one to craft custom getCacheEntry functions for each protocol. I know initially I was hoping to generate these functions, but I now don’t think that is possible without restricting what protocols can be supported by SLICC. Instead we can use these customized getCacheEntry functions to pass the cache entry to the actions via the trigger function. For those controllers that manage multiple cache memories, it is up to the programmer to understand what the cache entry pointer points to. That should eliminate the need to have multiple *cacheMemory_entry variables in the .sm files. Instead there is just the cache_entry variable that is set either by the trigger function call or set_cache_entry. Does that make sense to you? Brad From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Tuesday, January 04, 2011 9:43 AM To: Nilay Vaish; Default; Beckmann, Brad Subject: Re: Review Request: Changing how CacheMemory interfaces with SLICC This is an automatically generated e-mail. To reply, visit: http://reviews.m5sim.org/r/358/ On January 3rd, 2011, 3:31 p.m., Brad Beckmann wrote: Hi Nilay, First, I must say this is an impressive amount of work. You definitely got a lot done over holiday break. :) Overall, this set of patches is definitely close, but I want to see if we can take them a step forward. Also I have a few suggestions that may make things easier. Finally, I have a bunch of minor questions/suggestions on individual lines, but I’ll hold off on those until you can respond to my higher-level questions. The main thing I would like to see improved is not having to differentiate between “entry” and “entry_ptr” in the .sm files. 
Am I correct that the only functions in the .sm files that are passed an “entry_ptr” are “is_valid_ptr”, “getCacheEntry”, and “set_cache_entry”? If so, it seems that all three functions are generated with unique python code, either in an AST file or StateMachine.py. Therefore, could we just pass these functions “entry” and rely on the underlying Python code to generate the correct references? This would make things more readable, “is_valid_ptr()” becomes “is_valid”, and it doesn’t require the slicc programmer to understand which functions take an entry pointer versus the entry itself. If we can’t make such a change, I worry about how much extra complexity this change pushes on the slicc programmer. Also another suggestion to make things more readable, please replace the name L1IcacheMemory_entry with L1I_entry. Do the same for L1D_entry and L2_entry. That will shorten many of your lines. So am I correct that hammer’s simultaneous usage of valid L1 and L2 cache entries in certain transitions is the only reason that within all actions, the getCacheEntry calls take multiple cache entries? If so, I think it would be fairly trivial to use a tbe entry as an intermediary between the L1 and L2 for those particular hammer transitions. That way only one cache entry is valid at any particular time, and we can simply use the variable cache_entry in the actions. That should clean things up a lot. By the way, once you check in these patches, the MESI_CMP_directory protocol will be deprecated, correct? If so, make sure you include a patch that removes it from the regression tester. Brad The main thing I would like to see improved is not having to differentiate between “entry” and “entry_ptr” in the .sm files. Am I correct that the only functions in the .sm files that are passed an “entry_ptr” are “is_valid_ptr”, “getCacheEntry”, and “set_cache_entry”? If so, it seems that all three functions are generated with unique python code, either in an AST file or StateMachine.py. 
Therefore, could we just pass these functions “entry” and rely on the underlying Python code to generate the correct references? This would make things more readable, “is_valid_ptr()” becomes “is_valid”, and it doesn’t require the slicc programmer to understand which functions take an entry pointer versus the entry itself. If we can’t make such a change, I worry about how much extra complexity this change pushes on the slicc programmer. There are functions that are passed a cache entry and a transaction buffer entry as arguments. Currently, I assume that these arguments are passed using pointers. Also another suggestion to make things more readable, please replace the name L1IcacheMemory_entry with L1I_entry. Do the same for L1D_entry and L2_entry. That will shorten many of your lines. The names of the cache entry variables
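The hand-crafted getCacheEntry Brad has in mind resolves to at most one valid entry pointer, which the trigger call then hands to the actions as cache_entry. A sketch of that shape for a controller with mutually exclusive L1I/L1D caches; every type and name below is an assumption for illustration (a toy single-entry CacheMemory stands in for Ruby's real class).

```cpp
// Sketch: a custom getCacheEntry for a controller managing two
// exclusive cache memories. At most one lookup succeeds; the returned
// pointer (possibly NULL) is what the actions see as 'cache_entry'.
#include <cstddef>

struct AbstractCacheEntry { int state; };

struct CacheMemory {
    unsigned long long resident_addr;   // toy single-entry cache
    AbstractCacheEntry entry;

    // Returns the entry if the address is resident, NULL otherwise.
    AbstractCacheEntry *lookup(unsigned long long addr) {
        return addr == resident_addr ? &entry : NULL;
    }
};

AbstractCacheEntry *
getCacheEntry(CacheMemory &l1i, CacheMemory &l1d, unsigned long long addr)
{
    AbstractCacheEntry *e = l1i.lookup(addr);
    if (e != NULL)
        return e;
    return l1d.lookup(addr);   // NULL when the block is in neither cache
}
```

Because the caches are exclusive, the single returned pointer is unambiguous, which is what lets the .sm files use one cache_entry variable instead of per-cache *cacheMemory_entry variables.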
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Hi Nilay, My responses are below: The main thing I would like to see improved is not having to differentiate between “entry” and “entry_ptr” in the .sm files. Am I correct that the only functions in the .sm files that are passed an “entry_ptr” are “is_valid_ptr”, “getCacheEntry”, and “set_cache_entry”? If so, it seems that all three functions are generated with unique python code, either in an AST file or StateMachine.py. Therefore, could we just pass these functions “entry” and rely on the underlying Python code to generate the correct references? This would make things more readable, “is_valid_ptr()” becomes “is_valid”, and it doesn’t require the slicc programmer to understand which functions take an entry pointer versus the entry itself. If we can’t make such a change, I worry about how much extra complexity this change pushes on the slicc programmer. There are functions that are passed a cache entry and a transaction buffer entry as arguments. Currently, I assume that these arguments are passed using pointers. [BB] So does that mean that the cache entry is always passed in as a pointer? If so, can one just use cache_entry for all function calls and remove any use of cache_entry_ptr in the .sm files? That is essentially what I would like to see. Also another suggestion to make things more readable, please replace the name L1IcacheMemory_entry with L1I_entry. Do the same for L1D_entry and L2_entry. That will shorten many of your lines. The names of the cache entry variables are currently tied to the names of the cache memory variables belonging to the machine. If the name of the cache memory variable is A, then the corresponding cache entry variable is named A_entry. [BB] Ah, I see. Ok then let's just keep them the way they are for now. We can deal with shortening the names later. 
So am I correct that hammer’s simultaneous usage of valid L1 and L2 cache entries in certain transitions is the only reason that within all actions, the getCacheEntry calls take multiple cache entries? If so, I think it would be fairly trivial to use a tbe entry as an intermediary between the L1 and L2 for those particular hammer transitions. That way only one cache entry is valid at any particular time, and we can simply use the variable cache_entry in the actions. That should clean things up a lot. Oops! Should have thought of that before doing all those changes. But can we assume that we would always have only one valid cache entry pointer at any given time? If that's true, I would probably revert to the previous version of the patch. This should also resolve the naming issue. [BB] I wouldn't have expected you to realize that. It is one of those things that isn't completely obvious without spending a lot of time developing protocols. Yes, I think it is easiest for you to just revert to the previous version of the patch and just modify the hammer protocol to use a tbe entry as an intermediary. We've always had an unofficial rule that a controller can only manage multiple caches if those caches are exclusive with respect to each other. For the most part, that rule has been followed by all the protocols I'm familiar with. I think your change just makes that an official policy. By the way, once you check in these patches, the MESI_CMP_directory protocol will be deprecated, correct? If so, make sure you include a patch that removes it from the regression tester. I have a patch for the protocol, but I need to discuss it. Do you think it is possible that a protocol is not in a deadlock, but the random tester reports one because the underlying memory system is taking too much time? The patch works for 1, 2, and 4 processors for 10,000,000 loads. I have tested these processor configurations with 40 different seed values. 
But for 8 processors, the random tester outputs something like this --

panic: Possible Deadlock detected. Aborting!
version: 6 request.paddr: 12779 m_writeRequestTable: 15 current time: 369500011 issue_time: 368993771 difference: 506240
@ cycle 369500011 [wakeup:build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc, line 123]

[BB] Yes, the current version of MESI_CMP_directory is broken in many places. Arka just told me that he recently fixed many of those problems. I suggest getting his fixes and working from there. Brad ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
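The "Possible Deadlock detected" panic comes from an age check on outstanding requests: when the oldest request has been in flight longer than a threshold, the Sequencer assumes deadlock. A sketch of that check follows; the structure and names are assumptions, with the real logic living in src/mem/ruby/system/Sequencer.cc.

```cpp
// Sketch: compute the age of the oldest outstanding request so the
// caller can panic when it exceeds a deadlock threshold.
#include <map>

typedef unsigned long long Cycles;

struct SequencerRequest { Cycles issue_time; };

Cycles oldestRequestAge(const std::map<int, SequencerRequest> &requestTable,
                        Cycles current_time)
{
    Cycles oldest = 0;
    std::map<int, SequencerRequest>::const_iterator it;
    for (it = requestTable.begin(); it != requestTable.end(); ++it) {
        Cycles age = current_time - it->second.issue_time;
        if (age > oldest)
            oldest = age;   // this request has waited the longest so far
    }
    return oldest;          // caller panics if this exceeds the threshold
}
```

Plugging in the values from Nilay's panic message (issue_time 368993771 at cycle 369500011) gives an age of 506240 cycles, the "difference" the panic reports; a slow but forward-progressing memory system can therefore trip this check without a true deadlock, which is exactly Nilay's question.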
Re: [m5-dev] Review Request: Changing how CacheMemory interfaces with SLICC
Hi Nilay, At one point in time, the combination of several letters at the beginning of the action name corresponded to the shorthand name for the action. The shorthand name is the letter or letter combination that appears in the HTML tables. SLICC may have once enforced that the combination of letters matched the HTML shorthand name, but I don't believe it does right now. Therefore the letters are just a guide to match actions with their associated shorthand name. And yes, you can use any combination. Brad -Original Message- From: Nilay Vaish [mailto:ni...@cs.wisc.edu] Sent: Tuesday, January 04, 2011 12:03 PM To: Beckmann, Brad Cc: Default Subject: RE: Review Request: Changing how CacheMemory interfaces with SLICC Brad Is there a reason why each action name follows the pattern of a combination of several letters, then an underscore, then the action performed? The letters used are not abbreviations of the action performed. Can we use any combination? Thanks Nilay On Tue, 4 Jan 2011, Beckmann, Brad wrote: Hi Nilay, My responses are below: The main thing I would like to see improved is not having to differentiate between “entry” and “entry_ptr” in the .sm files. Am I correct that the only functions in the .sm files that are passed an “entry_ptr” are “is_valid_ptr”, “getCacheEntry”, and “set_cache_entry”? If so, it seems that all three functions are generated with unique python code, either in an AST file or StateMachine.py. Therefore, could we just pass these functions “entry” and rely on the underlying Python code to generate the correct references? This would make things more readable, “is_valid_ptr()” becomes “is_valid”, and it doesn’t require the slicc programmer to understand which functions take an entry pointer versus the entry itself. If we can’t make such a change, I worry about how much extra complexity this change pushes on the slicc programmer. There are functions that are passed a cache entry and a transaction buffer entry as arguments. 
Currently, I assume that these arguments are passed using pointers. [BB] So does that mean that the cache entry is always passed in as a pointer? If so, can one just use cache_entry for all function calls and remove any use of cache_entry_ptr in the .sm files? That is essentially what I would like to see. Also another suggestion to make things more readable, please replace the name L1IcacheMemory_entry with L1I_entry. Do the same for L1D_entry and L2_entry. That will shorten many of your lines. The names of the cache entry variables are currently tied to the names of the cache memory variables belonging to the machine. If the name of the cache memory variable is A, then the corresponding cache entry variable is named A_entry. [BB] Ah, I see. Ok then let's just keep them the way they are for now. We can deal with shortening the names later. So am I correct that hammer’s simultaneous usage of valid L1 and L2 cache entries in certain transitions is the only reason that within all actions, the getCacheEntry calls take multiple cache entries? If so, I think it would be fairly trivial to use a tbe entry as an intermediary between the L1 and L2 for those particular hammer transitions. That way only one cache entry is valid at any particular time, and we can simply use the variable cache_entry in the actions. That should clean things up a lot. Oops! Should have thought of that before doing all those changes. But can we assume that we would always have only one valid cache entry pointer at any given time? If that's true, I would probably revert to the previous version of the patch. This should also resolve the naming issue. [BB] I wouldn't have expected you to realize that. It is one of those things that isn't completely obvious without spending a lot of time developing protocols. Yes, I think it is easiest for you to just revert to the previous version of the patch and just modify the hammer protocol to use a tbe entry as an intermediary. 
We've always had an unofficial rule that a controller can only manage multiple caches if those caches are exclusive with respect to each other. For the most part, that rule has been followed by all the protocols I'm familiar with. I think your change just makes that an official policy. By the way, once you check in these patches, the MESI_CMP_directory protocol will be deprecated, correct? If so, make sure you include a patch that removes it from the regression tester. I have a patch for the protocol, but I need to discuss it. Do you think it is possible that a protocol is not in a deadlock, but the random tester reports one because the underlying memory system is taking too much time? The patch works for 1, 2, and 4 processors for 10,000,000 loads. I have tested these processor configurations with 40 different seed values. But for 8 processors, random
Re: [m5-dev] Deadlock while running ruby_random_test.py
Hi Nilay, The following protocols (all of which are tested by the regression tester) should work correctly with the ruby random tester:
MOESI_CMP_directory
MOESI_hammer
MOESI_token
MI_example
Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, December 21, 2010 6:24 PM To: M5 Developer List Subject: Re: [m5-dev] Deadlock while running ruby_random_test.py Brad, which protocols work correctly with the ruby random tester? On Tue, 21 Dec 2010, Beckmann, Brad wrote: Hi Nilay, If I'm correctly reproducing your problem, I believe I know what the issue is. However, before I try to fix it, I want to propose simply getting rid of the MESI_CMP_directory. The more I look at that protocol, the more problems I see. There are several design and logic issues in the protocol. Unless someone wants to volunteer to fix them, I say we get rid of it as well as all of the protocols not being tested by the regression tester. Now the particular problem that I see causing the deadlock is that the L2 cache drops a PUTX request from the L1 because the L2 is in the SS_MB state. Thus the L1 remains in the M_I state forever, which of course will eventually lead to a deadlock. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, December 21, 2010 1:04 PM To: m5-dev@m5sim.org Subject: [m5-dev] Deadlock while running ruby_random_test.py I am running ALPHA_SE_MESI_CMP_directory with ruby_random_test.py. I supply the option -l as 2000. I have pasted the output below. This was generated using the latest version of m5. Actually, while testing my own changes to SLICC and protocol files, I also observed the deadlock at cycle 301. So I ran the latest version and found that even that gets stuck. Is this a known problem? Am I doing something wrong? 
Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Deadlock while running ruby_random_test.py
Yep, that is the beauty of the random tester. It is much easier to fix problems when you can reproduce them in 3 M cycles vs. 200 B. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Sent: Tuesday, December 21, 2010 8:00 PM To: M5 Developer List Subject: Re: [m5-dev] Deadlock while running ruby_random_test.py It is kind of surprising that the random tester can detect the bug within 3,000,000 cycles while nothing happened on running ruby_fs.py for 200,000,000,000 cycles. On Tue, December 21, 2010 8:24 pm, Nilay Vaish wrote: Brad, which protocols work correctly with the ruby random tester? On Tue, 21 Dec 2010, Beckmann, Brad wrote: Hi Nilay, If I'm correctly reproducing your problem, I believe I know what the issue is. However, before I try to fix it, I want to propose simply getting rid of the MESI_CMP_directory. The more I look at that protocol, the more problems I see. There are several design and logic issues in the protocol. Unless someone wants to volunteer to fix them, I say we get rid of it as well as all of the protocols not being tested by the regression tester. Now the particular problem that I see causing the deadlock is that the L2 cache drops a PUTX request from the L1 because the L2 is in the SS_MB state. Thus the L1 remains in the M_I state forever, which of course will eventually lead to a deadlock. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, December 21, 2010 1:04 PM To: m5-dev@m5sim.org Subject: [m5-dev] Deadlock while running ruby_random_test.py I am running ALPHA_SE_MESI_CMP_directory with ruby_random_test.py. I supply the option -l as 2000. I have pasted the output below. This was generated using the latest version of m5. Actually, while testing my own changes to SLICC and protocol files, I also observed the deadlock at cycle 301. So I ran the latest version and found that even that gets stuck. 
Is this a known problem? Am I doing something wrong? Thanks Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev -- Nilay ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Deadlock while running ruby_random_test.py
Hi Nilay, If I'm correctly reproducing your problem, I believe I know what the issue is. However, before I try to fix it, I want to propose simply getting rid of the MESI_CMP_directory. The more I look at that protocol, the more problems I see. There are several design and logic issues in the protocol. Unless someone wants to volunteer to fix them, I say we get rid of it as well as all of the protocols not being tested by the regression tester. Now the particular problem that I see causing the deadlock is that the L2 cache drops a PUTX request from the L1 because the L2 is in the SS_MB state. Thus the L1 remains in the M_I state forever, which of course will eventually lead to a deadlock. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Tuesday, December 21, 2010 1:04 PM To: m5-dev@m5sim.org Subject: [m5-dev] Deadlock while running ruby_random_test.py I am running ALPHA_SE_MESI_CMP_directory with ruby_random_test.py. I supply the option -l as 2000. I have pasted the output below. This was generated using the latest version of m5. Actually, while testing my own changes to SLICC and protocol files, I also observed the deadlock at cycle 301. So I ran the latest version and found that even that gets stuck. Is this a known problem? Am I doing something wrong? Thanks Nilay

- M5 Simulator System
Copyright (c) 2001-2008 The Regents of The University of Michigan All Rights Reserved
M5 compiled Dec 21 2010 14:51:00
M5 revision 85e1847726e3 7798 default tip
M5 started Dec 21 2010 14:52:30
M5 executing on scamorza.cs.wisc.edu
command line: ./build/ALPHA_SE_MESI_CMP_directory/m5.debug ./configs/example/ruby_random_test.py -l 2000
Global frequency set at 10 ticks per second
info: Entering event queue @ 0. Starting simulation... 
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:102: Possible Deadlock detected
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:102: Possible Deadlock detected
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:103: m_version is 0
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:103: m_version is 0
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:104: request->ruby_request.paddr is 1092
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:104: request->ruby_request.paddr is 1092
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:105: m_readRequestTable.size() is 4
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:105: m_readRequestTable.size() is 4
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:106: current_time is 301
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:106: current_time is 301
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:107: request->issue_time is 2292161
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:107: request->issue_time is 2292161
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:108: current_time - request->issue_time is 707840
Warning: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:108: current_time - request->issue_time is 707840
Fatal Error: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:109: Aborting
Fatal Error: in fn virtual void Sequencer::wakeup() in build/ALPHA_SE_MESI_CMP_directory/mem/ruby/system/Sequencer.cc:109: Aborting
Program aborted at cycle 301
Aborted
___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Implementation of findTagInSet
Hi Nilay, I apologize for the delay, but I was mostly travelling / in meetings last week and I didn't have a chance to review your patches and emails until this morning. Overall, your patches are definitely solid steps in the right direction, and your profiling data sounds very promising. If you get the chance, please send it to me. I would be interested to know what the top performance bottlenecks are after your change. Before you spend time converting the other protocols, I do want to discuss the three points you brought up last week (see below). I have a bunch of free time over the next three days (Mon. - Wed.), and I do think a telephone conversation is the best way to discuss these details. Let me know what times work for you. Brad

1. Currently the implicit TBE and Cache Entry pointers are set to NULL in the calls to the doTransition() function. To set these, we would need to make calls to a function that returns the pointer if the address is in the cache, and NULL otherwise. I think we should retain the getEntry functions in the .sm files since, in the case of the L1 cache, both the instruction and the data cache need to be checked. This is something that I would probably prefer keeping out of SLICC. In fact, we should add getEntry functions for TBEs wherever required. These getEntry functions would now return a pointer instead of a reference. We would need to add support for return_by_pointer to SLICC. Also, since these functions would be used inside the wakeup function, we would need to assume a common name for them across all protocols, just like the getState() function.

[BB] I would be very interested to know why you believe we should keep the getEntry functions out of SLICC. In my mind, this is one of the few functions that is very consistent across protocols. As I mentioned before, I really want to keep any notion of pointers out of the .sm files and avoid the changes you are proposing to getCacheEntry. We should probably discuss this in detail over the phone.

2.
I still think we would need to change the changePermission function in the CacheMemory class. Presently it calls findTagInSet() twice. Instead, we would pass in the CacheEntry whose permissions need to be changed, which would save one call. We should also put the variable m_locked in the AbstractCacheEntry (maybe making it part of the permission variable) to avoid the second call.

[BB] I like moving the locked field to AbstractCacheEntry and removing the separate m_locked data structure. However, just a minor point: we should avoid duplicating code in CacheMemory to support this change. Other than that, this looks good to me.

3. In the getState() and setState() functions, we need to specify that the function assumes that the implicit TBE and CacheEntry pointers have been passed as arguments. How should we do this? I think we would need to push them into the symbol table before they can be used inside the function.

[BB] I'm a little confused by your current patch. It appears that you are proposing having two pairs of getState and setState functions. I would really like to avoid that and just have one pair of getState and setState functions. Also, when I say implicitly pass the TBE and CacheEntry pointers, I mean that for the actions (similar to address). However, I think it is fine to explicitly pass these parameters into getState and setState (also similar to Address and State). ___ m5-dev mailing list m5-dev@m5sim.org http://m5sim.org/mailman/listinfo/m5-dev
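The pointer-returning lookup in point 1 can be sketched as follows. This is a minimal, hypothetical model (the struct and function names are illustrative, not actual SLICC-generated code) of a getCacheEntry() that returns NULL on a miss and, for an L1 controller, consults both the instruction and the data cache:

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

// Illustrative stand-ins for AbstractCacheEntry and CacheMemory.
struct CacheEntry { int permission; };

struct CacheMemory {
    std::unordered_map<uint64_t, CacheEntry> tags;  // address -> entry
    CacheEntry *lookup(uint64_t addr) {
        auto it = tags.find(addr);
        return it == tags.end() ? nullptr : &it->second;
    }
};

// One common-named entry point per machine, so generated code (e.g. the
// wakeup function) can call it without knowing which caches back the machine.
inline CacheEntry *getCacheEntry(CacheMemory &l1i, CacheMemory &l1d,
                                 uint64_t addr)
{
    if (CacheEntry *e = l1i.lookup(addr))
        return e;                 // hit in the instruction cache
    return l1d.lookup(addr);      // hit in the data cache, or nullptr on miss
}
```

Under this convention the caller tests the returned pointer against NULL instead of making a separate isTagPresent call, which is the single-lookup behavior the thread is after.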
Re: [m5-dev] changeset in m5: ARM: Add checkpointing support
Hi Ali, I just synced with this changeset 7733, as well as changeset 7730, and I now notice that the modifications to physical.cc break all previous checkpoints. Can we put the lal_addr and lal_cid serialization and unserialization in a conditional that tests for the ARM ISA? I welcome other suggestions as well. In general, I would be interested to hear other people's thoughts on adding a checkpoint test to the regression tester. It would be great if we can at least identify ahead of time what changesets break older checkpoints. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Ali Saidi Sent: Monday, November 08, 2010 11:59 AM To: m5-dev@m5sim.org Subject: [m5-dev] changeset in m5: ARM: Add checkpointing support changeset 08d6a773d1b6 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=08d6a773d1b6 description: ARM: Add checkpointing support diffstat: src/arch/arm/isa.hh | 12 +- src/arch/arm/linux/system.cc | 5 +- src/arch/arm/linux/system.hh | 4 +- src/arch/arm/pagetable.hh| 87 +++ src/arch/arm/table_walker.cc | 16 ++- src/arch/arm/table_walker.hh | 2 +- src/arch/arm/tlb.cc | 14 ++- src/arch/arm/tlb.hh | 2 - src/dev/arm/gic.cc | 44 +- src/dev/arm/pl011.cc | 42 - src/dev/arm/rv_ctrl.cc | 2 - src/dev/arm/timer_sp804.cc | 59 - src/dev/arm/timer_sp804.hh | 4 ++ src/mem/physical.cc | 30 +++ src/mem/physical.hh | 5 ++ src/sim/SConscript | 1 + src/sim/system.cc| 2 +- src/sim/system.hh| 2 +- 18 files changed, 268 insertions(+), 65 deletions(-) diffs (truncated from 587 to 300 lines): diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/isa.hh --- a/src/arch/arm/isa.hh Mon Nov 08 13:58:24 2010 -0600 +++ b/src/arch/arm/isa.hh Mon Nov 08 13:58:25 2010 -0600 @@ -178,10 +178,18 @@ } void serialize(EventManager *em, std::ostream os) -{} +{ +DPRINTF(Checkpoint, Serializing Arm Misc Registers\n); +SERIALIZE_ARRAY(miscRegs, NumMiscRegs); +} void unserialize(EventManager *em, Checkpoint *cp, const std::string section) 
-{} +{ +DPRINTF(Checkpoint, Unserializing Arm Misc Registers\n); +UNSERIALIZE_ARRAY(miscRegs, NumMiscRegs); +CPSR tmp_cpsr = miscRegs[MISCREG_CPSR]; +updateRegMap(tmp_cpsr); +} ISA() { diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/linux/system.cc --- a/src/arch/arm/linux/system.ccMon Nov 08 13:58:24 2010 -0600 +++ b/src/arch/arm/linux/system.ccMon Nov 08 13:58:25 2010 -0600 @@ -99,9 +99,9 @@ } void -LinuxArmSystem::startup() +LinuxArmSystem::initState() { -ArmSystem::startup(); +ArmSystem::initState(); ThreadContext *tc = threadContexts[0]; // Set the initial PC to be at start of the kernel code @@ -117,7 +117,6 @@ { } - LinuxArmSystem * LinuxArmSystemParams::create() { diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/linux/system.hh --- a/src/arch/arm/linux/system.hhMon Nov 08 13:58:24 2010 -0600 +++ b/src/arch/arm/linux/system.hhMon Nov 08 13:58:25 2010 -0600 @@ -67,8 +67,8 @@ LinuxArmSystem(Params *p); ~LinuxArmSystem(); -/** Initialize the CPU for booting */ -void startup(); +void initState(); + private: #ifndef NDEBUG /** Event to halt the simulator if the kernel calls panic() */ diff -r a2c660de7787 -r 08d6a773d1b6 src/arch/arm/pagetable.hh --- a/src/arch/arm/pagetable.hh Mon Nov 08 13:58:24 2010 -0600 +++ b/src/arch/arm/pagetable.hh Mon Nov 08 13:58:25 2010 -0600 @@ -48,6 +48,8 @@ #include arch/arm/vtophys.hh #include config/full_system.hh +#include sim/serialize.hh + namespace ArmISA { struct VAddr @@ -71,39 +73,6 @@ }; -struct TlbRange -{ -Addr va; -Addr size; -int contextId; -bool global; - -inline bool -operator(const TlbRange r2) const -{ -if (!(global || r2.global)) { -if (contextId r2.contextId) -return true; -else if (contextId r2.contextId) -return false; -} - -if (va r2.va) -return true; -return false; -} - -inline bool -operator==(const TlbRange r2) const -{ -return va == r2.va - size == r2.size - contextId == r2.contextId - global == r2.global; -} -}; - - // ITB/DTB table entry struct TlbEntry { @@ -143,10 +112,8 @@ // Access permissions 
bool xn;
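One low-effort way to get the backward compatibility discussed above is to treat newly added keys (such as the lal_addr / lal_cid state) as optional during unserialization and substitute a default when an older checkpoint lacks them. A minimal sketch that models a checkpoint section as a plain string map; this is not the actual m5 Serializable API, and unserializeOpt is a hypothetical helper:

```cpp
#include <cassert>
#include <map>
#include <string>

// A checkpoint section modeled as key -> serialized value.
using Section = std::map<std::string, std::string>;

// Read `key` from the section if present; otherwise fall back to `def`,
// so checkpoints written before the field existed still load cleanly.
inline long unserializeOpt(const Section &sec, const std::string &key, long def)
{
    auto it = sec.find(key);
    return it == sec.end() ? def : std::stol(it->second);
}
```

The alternative raised in the email -- guarding the new fields behind an ISA check -- keeps old checkpoints working for non-ARM builds but still breaks any ARM checkpoints taken before the change, so an optional-read scheme is the more general fix.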
Re: [m5-dev] changeset in m5: Mem: Finish half-baked support for mmaping file...
Hi Ali, This is changeset 7730, which also breaks all previous checkpoints because it requires physmem to serialize and unserialize the variable _size. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Ali Saidi Sent: Monday, November 08, 2010 11:59 AM To: m5-dev@m5sim.org Subject: [m5-dev] changeset in m5: Mem: Finish half-baked support for mmaping file... changeset 982b4c6c1470 in /z/repo/m5 details: http://repo.m5sim.org/m5?cmd=changeset;node=982b4c6c1470 description: Mem: Finish half-baked support for mmaping file in physmem. Physmem has a parameter to be able to mem map a file, however it isn't actually used. This changeset utilizes the parameter so a file can be mmapped. diffstat: configs/common/FSConfig.py | 8 ++- src/mem/physical.cc| 48 +++-- src/mem/physical.hh| 8 +++--- 3 files changed, 44 insertions(+), 20 deletions(-) diffs (176 lines): diff -r d3c006ecccd3 -r 982b4c6c1470 configs/common/FSConfig.py --- a/configs/common/FSConfig.py Mon Nov 08 13:58:24 2010 -0600 +++ b/configs/common/FSConfig.py Mon Nov 08 13:58:24 2010 -0600 @@ -200,9 +200,12 @@ self.membus.badaddr_responder.warn_access = warn self.bridge = Bridge(delay='50ns', nack_delay='4ns') self.physmem = PhysicalMemory(range = AddrRange(mdesc.mem()), zero = True) +self.diskmem = PhysicalMemory(range = AddrRange(Addr('128MB'), size = '128MB'), + file = disk('ael-arm.ext2')) self.bridge.side_a = self.iobus.port self.bridge.side_b = self.membus.port self.physmem.port = self.membus.port +self.diskmem.port = self.membus.port self.mem_mode = mem_mode @@ -224,7 +227,10 @@ self.intrctrl = IntrControl() self.terminal = Terminal() -self.boot_osflags = 'earlyprintk mem=128MB console=ttyAMA0 lpj=19988480 norandmaps' +self.kernel = binary('vmlinux.arm') +self.boot_osflags = 'earlyprintk mem=128MB console=ttyAMA0 lpj=19988480' + \ +' norandmaps slram=slram0,0x800,+0x800' + \ +' mtdparts=slram0:- rw loglevel=8 root=/dev/mtdblock0' return self diff -r d3c006ecccd3
-r 982b4c6c1470 src/mem/physical.cc --- a/src/mem/physical.cc Mon Nov 08 13:58:24 2010 -0600 +++ b/src/mem/physical.cc Mon Nov 08 13:58:24 2010 -0600 @@ -31,6 +31,7 @@ #include sys/types.h #include sys/mman.h +#include sys/user.h #include errno.h #include fcntl.h #include unistd.h @@ -41,6 +42,7 @@ #include string #include arch/registers.hh +#include base/intmath.hh #include base/misc.hh #include base/random.hh #include base/types.hh @@ -56,26 +58,39 @@ PhysicalMemory::PhysicalMemory(const Params *p) : MemObject(p), pmemAddr(NULL), pagePtr(0), lat(p-latency), lat_var(p-latency_var), - cachedSize(params()-range.size()), cachedStart(params()- range.start) + _size(params()-range.size()), _start(params()-range.start) { -if (params()-range.size() % TheISA::PageBytes != 0) +if (size() % TheISA::PageBytes != 0) panic(Memory Size not divisible by page size\n); if (params()-null) return; -int map_flags = MAP_ANON | MAP_PRIVATE; -pmemAddr = (uint8_t *)mmap(NULL, params()-range.size(), - PROT_READ | PROT_WRITE, map_flags, -1, 0); + +if (params()-file == ) { +int map_flags = MAP_ANON | MAP_PRIVATE; +pmemAddr = (uint8_t *)mmap(NULL, size(), + PROT_READ | PROT_WRITE, map_flags, -1, 0); +} else { +int map_flags = MAP_PRIVATE; +int fd = open(params()-file.c_str(), O_RDONLY); +_size = lseek(fd, 0, SEEK_END); +lseek(fd, 0, SEEK_SET); +pmemAddr = (uint8_t *)mmap(NULL, roundUp(size(), PAGE_SIZE), + PROT_READ | PROT_WRITE, map_flags, fd, 0); +} if (pmemAddr == (void *)MAP_FAILED) { perror(mmap); -fatal(Could not mmap!\n); +if (params()-file == ) +fatal(Could not mmap!\n); +else +fatal(Could not find file: %s\n, params()-file); } //If requested, initialize all the memory to 0 if (p-zero) -memset(pmemAddr, 0, p-range.size()); +memset(pmemAddr, 0, size()); } void @@ -94,8 +109,7 @@ PhysicalMemory::~PhysicalMemory() { if (pmemAddr) -munmap((char*)pmemAddr, params()-range.size()); -//Remove memPorts? 
+munmap((char*)pmemAddr, size()); } Addr @@ -408,7 +422,7 @@ { snoop = false; resp.clear(); -resp.push_back(RangeSize(start(), params()-range.size())); +resp.push_back(RangeSize(start(), size())); } unsigned @@ -463,6 +477,7 @@ string
Re: [m5-dev] Implementation of findTagInSet
Hi Nilay, Yes, I believe a machine can be accessed within the AST class functions, though I don't remember ever doing it myself. Look at the generate() function in TypeFieldEnumAST. There you can see that the machine (a.k.a. StateMachine) is grabbed from the symbol table, and then different StateMachine functions are called on it. You can imagine adding a new function to StateMachine.py that returns whether the TBETable exists. That seems like it should work to me, but let me know if it doesn't. Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Thursday, December 09, 2010 5:24 PM To: M5 Developer List Subject: Re: [m5-dev] Implementation of findTagInSet Hi Brad, Is there a way to access the StateMachine object inside any of the AST class functions? I know the name of the machine can be accessed, but can the machine itself be accessed? I need one of the variables in the StateMachine object to know whether or not a TBETable exists in this machine. Nilay On Wed, 8 Dec 2010, Beckmann, Brad wrote: Hi Nilay, I think we can avoid handling pointers in the getState and setState functions if we also add bool functions is_cache_entry_valid and is_tbe_entry_valid that are implicitly defined in SLICC. I don't think we should try to get rid of getState and setState, since they often contain valuable, protocol-specific checks. Instead, for getState and setState, I believe we should simply replace the current isTagPresent calls with the new is_*_valid calls. As far as changePermission() goes, your solution seems reasonable, but we may also want to consider just not changing that function at all. changePermission() doesn't actually use a cache entry within the .sm file, so it doesn't necessarily need to be changed. Going back to breaking this work into smaller portions, that is definitely a portion I feel can be pushed to the end or removed entirely.
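The is_cache_entry_valid / is_tbe_entry_valid idea above can be sketched as follows: SLICC implicitly defines boolean checks, so getState()/setState() in the .sm files never touch raw pointers, and a null pointer underneath simply means "no entry". All names here are illustrative, not actual generated code:

```cpp
#include <cassert>

// Illustrative stand-ins for the protocol's states and entry types.
enum State { I_state, S_state, M_state };
struct CacheEntry { State state; };
struct TBE { State state; };

// The implicit validity checks SLICC would provide.
inline bool is_cache_entry_valid(const CacheEntry *e) { return e != nullptr; }
inline bool is_tbe_entry_valid(const TBE *t) { return t != nullptr; }

// A getState() written against the checks instead of isTagPresent lookups.
inline State getState(const TBE *tbe, const CacheEntry *entry)
{
    if (is_tbe_entry_valid(tbe))
        return tbe->state;       // transient (TBE) state takes priority
    if (is_cache_entry_valid(entry))
        return entry->state;     // otherwise the stable cache state
    return I_state;              // no entry anywhere: block is Invalid
}
```

This keeps the protocol-specific priority logic (TBE over cache entry over Invalid) in the .sm file while hiding the pointer mechanics, which is the division of labor the thread converges on.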
Brad -Original Message- From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish Sent: Wednesday, December 08, 2010 11:53 AM To: M5 Developer List Subject: Re: [m5-dev] Implementation of findTagInSet Hi Brad, A couple of observations: a. If we make use of pointers, would we not need to handle them in the getState and setState functions? b. changePermission() seems to be a problem. It would still perform a lookup, because whether or not a CacheEntry is locked is maintained in the CacheMemory object and not with the entry itself. We can move that variable to be part of the AbstractCacheEntry, or we can combine it with the permission variable which is already there in the AbstractCacheEntry class. I think the lock is only used in the implementation of LL/SC instructions. Nilay On Wed, 8 Dec 2010, Beckmann, Brad wrote: Hi Nilay, Breaking the changes into small portions is a good idea, but we first need to decide exactly what we are doing. So far we've only thrown out some ideas; we have yet to scope out a complete solution. I think we've settled on passing some sort of reference to the cache and TBE entries, but exactly whether that is by reference variables or by pointers isn't clear. My initial preference is to use pointers in the generated code and set the pointers to NULL when a cache and/or TBE entry doesn't exist. However, one thing I really want to strive for is to keep pointer manipulation out of the .sm files. Writing SLICC code is hard enough, and we don't want to burden the SLICC programmer with memory management as well. So how about this plan? - Let's remove all the getCacheEntry functions from the slicc files. I believe that almost all of these functions look exactly the same, and it is easy enough for SLICC to just generate them instead.
- Similarly, let's get rid of all isCacheTagPresent functions as well. - Then let's replace all the getCacheEntry calls with an implicit SLICC-supported variable called cache_entry, and all the TBEs[addr*] calls with an implicit SLICC-supported variable called tbe_entry. - Underneath, these variables can actually be implemented as local inlined functions that assert that the entries are valid and then return variables local to the state machine, set to the current cache and TBE entry. - The trigger function will implicitly set these variables (pointers underneath) to NULL or valid values, and the only way they can be reset is through the explicit functions set_cache_entry, reset_cache_entry, set_tbe_entry, and reset_tbe_entry. These functions would be called by the appropriate actions, or possibly be merged with the existing check_allocate function. I think that will give us what we want, but I realize I've just proposed changing 100s if not 1000s of lines of SLICC code. I hope that these changes are straightforward, but any change like that is never