Re: [m5-dev] Store Buffer
In sum, I think we all agree that Ruby is going to handle *only non-speculative stores*. The M5 CPU model(s) handle all stores, speculative and non-speculative, that are *yet to be revealed to the memory sub-system*.

To make it clearer, as I understand it, we now have the following:

1. All store buffering (speculative and non-speculative) is handled by the CPU model in M5.
2. Ruby needs to forward interventions/invalidations received at the L1 cache controller to the CPU model, so that it can take appropriate action to provide the required memory consistency guarantees (e.g., it may need to flush the pipeline).
   OR
   CPU models need to check coherence permissions at the L1 cache at commit time to know whether intervening writes have happened (this might be required to implement a stricter model like SC).

I think we need to provide one of these functionalities from the Ruby side to allow the second condition above. Which one to provide depends on what the M5 CPU models want to do to guarantee consistency.

Please let me know if you disagree or if I am missing something.

Thanks
Arka

On 02/24/2011 05:22 PM, Beckmann, Brad wrote:
> So I think Steve and I are in agreement here. We both agree that both speculative and non-speculative store buffers should be on the CPU side of the RubyPort interface. I believe that was the same line that existed when Ruby was tied to Opal in GEMS. I believe the non-speculative store buffer was only a feature used when Opal was not attached, and it was just the simple SimicsProcessor driving Ruby.
>
> The sequencer is a separate issue. Certain functionality of the sequencer can probably be eliminated in gem5, but I think other functionality needs to remain or at least be moved to some other part of Ruby. The sequencer performs a lot of protocol-independent functionality, including: updating the actual data block, performing synchronization with respect to the cache memory, translating m5 packets to ruby requests, checking for per-cacheblock deadlock, and coalescing requests to the same cache block. The coalescing functionality can probably be eliminated, but I think the other functionality needs to remain.
>
> Brad

___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev
Re: [m5-dev] Store Buffer
This sounds right. Ruby does need to forward invalidations to the CPU, since some models (including O3) will need to do internal invalidations/flushes to maintain consistency. Others can choose to do it other ways (e.g., by querying the L1 at commit as you suggest), but they have the option of ignoring the forwarded invalidations, so that's not a problem.

Steve

On Fri, Feb 25, 2011 at 9:07 AM, Arkaprava Basu aba...@wisc.edu wrote:
> In sum, I think we all agree that Ruby is going to handle *only non-speculative stores*. The M5 CPU model(s) handle all stores, speculative and non-speculative, that are *yet to be revealed to the memory sub-system*.
>
> To make it clearer, as I understand it, we now have the following:
>
> 1. All store buffering (speculative and non-speculative) is handled by the CPU model in M5.
> 2. Ruby needs to forward interventions/invalidations received at the L1 cache controller to the CPU model, so that it can take appropriate action to provide the required memory consistency guarantees (e.g., it may need to flush the pipeline).
>    OR
>    CPU models need to check coherence permissions at the L1 cache at commit time to know whether intervening writes have happened (this might be required to implement a stricter model like SC).
>
> I think we need to provide one of these functionalities from the Ruby side to allow the second condition above. Which one to provide depends on what the M5 CPU models want to do to guarantee consistency.
>
> Please let me know if you disagree or if I am missing something.
>
> Thanks
> Arka
Re: [m5-dev] Store Buffer
It sounds like we are in agreement here, but I just want to make sure we clarify one item. I don't believe simply checking the coherence permissions at commit time can sufficiently support stronger consistency models like SC/TSO. Instead, you really need to know whether you've ever lost the block since the speculative instruction read it. Therefore, Ruby really does need to forward invalidations to the CPU. It sounded from your responses like you understand that as well, but I just wanted to make the point clear.

Brad

From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt
Sent: Friday, February 25, 2011 10:29 AM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer

> This sounds right. Ruby does need to forward invalidations to the CPU, since some models (including O3) will need to do internal invalidations/flushes to maintain consistency. Others can choose to do it other ways (e.g., by querying the L1 at commit as you suggest), but they have the option of ignoring the forwarded invalidations, so that's not a problem.
>
> Steve
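
The difference between a commit-time permission check and forwarded invalidations can be illustrated with a small Python sketch. This is not gem5 or Ruby code: `ToyL1`, `SpeculativeLoadTracker`, and every method on them are hypothetical names made up for the illustration, and the only idea taken from the thread is that the L1 forwards invalidations to the CPU side. The scenario assumes a block is invalidated and then re-fetched between a speculative load and its commit.

```python
class ToyL1:
    """Hypothetical L1 model: tracks which blocks are currently readable."""

    def __init__(self):
        self.valid = set()
        self.listeners = []      # CPU-side callbacks for forwarded invalidations

    def fill(self, block):
        self.valid.add(block)

    def invalidate(self, block):
        self.valid.discard(block)
        for notify in self.listeners:
            notify(block)        # forward the invalidation to the CPU side

    def has_permission(self, block):
        return block in self.valid


class SpeculativeLoadTracker:
    """Hypothetical LSQ fragment: remembers blocks read by uncommitted loads."""

    def __init__(self, l1):
        self.l1 = l1
        self.pending = {}        # load id -> block address
        self.must_replay = set() # loads whose block was lost at some point
        l1.listeners.append(self.on_invalidation)

    def speculative_load(self, load_id, block):
        self.pending[load_id] = block

    def on_invalidation(self, block):
        for load_id, pending_block in self.pending.items():
            if pending_block == block:
                self.must_replay.add(load_id)

    def commit_ok_by_permission_check(self, load_id):
        # Only asks whether the block is present right now.
        return self.l1.has_permission(self.pending[load_id])

    def commit_ok_by_forwarded_invalidation(self, load_id):
        # Asks whether the block was ever lost since the load executed.
        return load_id not in self.must_replay


l1 = ToyL1()
lsq = SpeculativeLoadTracker(l1)
l1.fill(0x40)
lsq.speculative_load(7, 0x40)   # load 7 reads block 0x40 speculatively
l1.invalidate(0x40)             # another core writes the block
l1.fill(0x40)                   # block is re-fetched before load 7 commits

permission_check = lsq.commit_ok_by_permission_check(7)
invalidation_check = lsq.commit_ok_by_forwarded_invalidation(7)
```

In this scenario `permission_check` is true, because the block is present again at commit, while `invalidation_check` is false, because the forwarded invalidation marked load 7 for replay. A model that wants to ignore forwarded invalidations can simply not register a listener.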
Re: [m5-dev] Store Buffer
On Thu, 24 Feb 2011, Arkaprava Basu wrote:

> Fundamentally, I wish to handle only non-speculative memory state within Ruby. Otherwise, I think there is a risk of Ruby being affected by the CPU model's behavior/nuances. As you suggested, RubyPort may well be the line dividing speculative and non-speculative state.

I also agree that beyond RubyPort, all the stores should be non-speculative.

> I haven't looked at the store buffer code in libruby and do not know how it interfaces with the protocols. So sorry, I don't have specific answers to your questions. I think Derek is the best person to comment on this, as I believe he has used the store buffer implementation for his prior research.

I think currently the store buffer is not being used at all. I looked through the GEMS code, and some of the protocols do declare a store buffer, but none of them makes use of it. In gem5, store buffers are not included in the protocol files. In fact, the current libruby code does nothing useful at all.

> I do think, though, that the highest-level (closest to the processor) cache controller (i.e., *-L1Cache.sm) needs to be made aware of the store buffer (unless it is hacked to bypass SLICC).
>
> Thanks
> Arka

--
Nilay

> On 02/23/2011 11:29 PM, Beckmann, Brad wrote:
>> Sorry, I should have been more clear. It fundamentally comes down to how the Ruby interface helps support memory consistency, especially considering more realistic buffering between the CPU and memory system (both speculative and non-speculative). I'm pretty certain that Ruby and the RubyPort interface will need to be changed. I just want us to fully understand the issues before making any changes or removing certain options.
>>
>> So are you advocating that the RubyPort interface be the line between speculative memory state and non-speculative memory state?
>>
>> As far as the current Ruby store buffer goes, how does it work with the L1 cache controller? For instance, if the L1 cache receives a probe/forwarded request to a block that exists in the non-speculative store buffer, what is the mechanism to retrieve the up-to-date data from the buffer entry? Is the mechanism protocol agnostic?
>>
>> Brad
Re: [m5-dev] Store Buffer
So we probably don't want to pass speculative store data to the RubyPort, but what about speculative load and store requests? I suspect we do want to send them to the RubyPort before the speculation is confirmed. That might require splitting stores into two separate transactions: the request and the actual data write. Also, I suspect that the RubyPort will need to forward probes to the CPU models to allow the LSQ to maintain the proper consistency model. If those two things end up being true, then what is the benefit of putting the non-speculative store buffer in each protocol, versus just in the o3 CPU model?

I'm not yet ready to advocate that this is the right solution. I just want us to think these issues through before deciding to go down one path or the other.

Brad

-----Original Message-----
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish
Sent: Thursday, February 24, 2011 10:45 AM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer

> I also agree that beyond RubyPort, all the stores should be non-speculative.
>
> I think currently the store buffer is not being used at all. I looked through the GEMS code, and some of the protocols do declare a store buffer, but none of them makes use of it. In gem5, store buffers are not included in the protocol files. In fact, the current libruby code does nothing useful at all.
>
> --
> Nilay
Re: [m5-dev] Store Buffer
On Thu, Feb 24, 2011 at 11:08 AM, Beckmann, Brad brad.beckm...@amd.com wrote:

> So we probably don't want to pass speculative store data to the RubyPort, but what about speculative load and store requests? I suspect we do want to send them to the RubyPort before the speculation is confirmed. That might require splitting stores into two separate transactions: the request and the actual data write. Also, I suspect that the RubyPort will need to forward probes to the CPU models to allow the LSQ to maintain the proper consistency model. If those two things end up being true, then what is the benefit of putting the non-speculative store buffer in each protocol, versus just in the o3 CPU model?
>
> I'm not yet ready to advocate that this is the right solution. I just want us to think these issues through before deciding to go down one path or the other.

I also support the concept of thinking things through, but I'm also happy to comment without having done that yet :-).

My gut instinct is to say that O3 already has an LSQ, so Ruby needs to send invalidations up to the core to support the consistency model, and if we do that there's no need for a store buffer in Ruby.

I'd like to better understand the arguments against that approach. For example, why would we want to send stores to Ruby when they are still speculative? Do we have real examples of systems that send the store address to the L1 cache speculatively? If we want to fetch store data more aggressively, wouldn't it be equivalent to generate a prefetch-with-write-intent first, then generate the store itself only when it commits? I think there are machines that do that.

Steve
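
The prefetch-then-store sequence suggested in this message can be sketched as a request stream. The Python below is only an illustration under assumptions: `Req`, `ToyCore`, and their methods are hypothetical names, not gem5 packet types or O3 interfaces. It shows the memory system seeing a data-less write-intent prefetch when the store executes, and the store itself only at commit.

```python
from enum import Enum


class Req(Enum):
    PREFETCH_EX = "prefetch-with-write-intent"
    STORE = "store"


class ToyCore:
    """Hypothetical core model that keeps store data private until commit."""

    def __init__(self):
        self.sent = []       # requests as the memory system would see them
        self.pending = {}    # seq -> (addr, data) for uncommitted stores

    def execute_store(self, seq, addr, data):
        # Address is known: ask for write permission early, carry no data.
        self.sent.append((Req.PREFETCH_EX, addr, None))
        self.pending[seq] = (addr, data)

    def commit_store(self, seq):
        # Only now does the store, with its data, leave the core.
        addr, data = self.pending.pop(seq)
        self.sent.append((Req.STORE, addr, data))

    def squash_store(self, seq):
        # A squashed store leaves only its earlier prefetch behind.
        self.pending.pop(seq, None)


core = ToyCore()
core.execute_store(1, 0x200, 5)
core.execute_store(2, 0x240, 6)
core.squash_store(2)     # store 2 was on a mispredicted path
core.commit_store(1)
```

After this sequence `core.sent` holds two prefetches followed by a single store to `0x200`; the squashed store never exposes its data, which is the property the thread wants at the RubyPort boundary.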
Re: [m5-dev] Store Buffer
Steve,

I think we are in agreement here, and we may just be disagreeing about the definition of speculative. From the Ruby perspective, I don't think it really matters... I don't think there is a difference between a speculative store address request and a prefetch-with-write-intent. Also, we agree that probes will need to be sent to the O3 LSQ to support the consistency model. My point is that if we believe this functionality is required, what is the extra overhead of adding a non-speculative store buffer to the O3 model as well? I think that will be easier than trying to incorporate the current Ruby non-speculative store buffer into each protocol.

Overall, I guess I'm concluding that we probably can delete the current Ruby store buffer. Do others agree?

Brad

From: Steve Reinhardt [mailto:ste...@gmail.com]
Sent: Thursday, February 24, 2011 11:20 AM
To: M5 Developer List
Cc: Beckmann, Brad
Subject: Re: [m5-dev] Store Buffer

> I also support the concept of thinking things through, but I'm also happy to comment without having done that yet :-).
>
> My gut instinct is to say that O3 already has an LSQ, so Ruby needs to send invalidations up to the core to support the consistency model, and if we do that there's no need for a store buffer in Ruby.
>
> I'd like to better understand the arguments against that approach. For example, why would we want to send stores to Ruby when they are still speculative? Do we have real examples of systems that send the store address to the L1 cache speculatively? If we want to fetch store data more aggressively, wouldn't it be equivalent to generate a prefetch-with-write-intent first, then generate the store itself only when it commits? I think there are machines that do that.
>
> Steve
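
As a rough picture of a CPU-side store buffer of the kind discussed here, the following Python sketch holds stores until commit and releases only committed ones, in program order, to the memory side. It is a toy under stated assumptions, not the O3 LSQ: `StoreEntry`, `CpuStoreBuffer`, and the `send_to_memory` callback (standing in for the port into Ruby) are hypothetical names.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class StoreEntry:
    seq: int                 # program-order sequence number
    addr: int
    data: int
    committed: bool = False


class CpuStoreBuffer:
    """Toy CPU-side store buffer (hypothetical; not gem5's O3 LSQ)."""

    def __init__(self, send_to_memory):
        self._entries = deque()
        self._send = send_to_memory   # stand-in for the port into Ruby

    def insert(self, seq, addr, data):
        self._entries.append(StoreEntry(seq, addr, data))

    def commit(self, seq):
        for entry in self._entries:
            if entry.seq == seq:
                entry.committed = True

    def squash_after(self, seq):
        """Drop speculative stores younger than seq (e.g. on a flush)."""
        self._entries = deque(
            e for e in self._entries if e.committed or e.seq <= seq)

    def drain(self):
        """Release committed stores, oldest first, to the memory system."""
        while self._entries and self._entries[0].committed:
            entry = self._entries.popleft()
            self._send(entry.addr, entry.data)


memory_side = []
sb = CpuStoreBuffer(lambda addr, data: memory_side.append((addr, data)))
sb.insert(1, 0x100, 11)
sb.insert(2, 0x140, 22)
sb.insert(3, 0x180, 33)
sb.commit(1)
sb.drain()            # only store 1 becomes visible to the memory side
sb.squash_after(1)    # stores 2 and 3 were speculative and are discarded
sb.drain()
```

Only the committed store to `0x100` ever reaches `memory_side`, so everything past the callback sees non-speculative stores only.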
Re: [m5-dev] Store Buffer
On Thu, 24 Feb 2011, Beckmann, Brad wrote:

> Steve, I think we are in agreement here, and we may just be disagreeing about the definition of speculative. From the Ruby perspective, I don't think it really matters... I don't think there is a difference between a speculative store address request and a prefetch-with-write-intent. Also, we agree that probes will need to be sent to the O3 LSQ to support the consistency model. My point is that if we believe this functionality is required, what is the extra overhead of adding a non-speculative store buffer to the O3 model as well? I think that will be easier than trying to incorporate the current Ruby non-speculative store buffer into each protocol.

Brad, I raise the same point that Arka raised earlier. Other processor models can also make use of a store buffer. So why should only O3 have a store buffer? Secondly, does every protocol need to know about the existence of a store buffer? Or can we let just the CacheMemory interface with the store buffer?

> Overall, I guess I'm concluding that we probably can delete the current Ruby store buffer. Do others agree?

I agree. I had a chat with Derek about this; he also agrees that it is not required in its present form.

--
Nilay
Re: [m5-dev] Store Buffer
On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish ni...@cs.wisc.edu wrote:

> On Thu, 24 Feb 2011, Beckmann, Brad wrote:
>
>> Steve, I think we are in agreement here, and we may just be disagreeing about the definition of speculative. From the Ruby perspective, I don't think it really matters... I don't think there is a difference between a speculative store address request and a prefetch-with-write-intent. Also, we agree that probes will need to be sent to the O3 LSQ to support the consistency model. My point is that if we believe this functionality is required, what is the extra overhead of adding a non-speculative store buffer to the O3 model as well? I think that will be easier than trying to incorporate the current Ruby non-speculative store buffer into each protocol.

I don't know the O3 LSQ model very well, but I assume it buffers both speculative and non-speculative stores. Are there two different structures in Ruby for that?

I think the general issue here is that the dividing line between processor and memory system is different in M5 than it was with GEMS, with M5 assuming that write buffers, redundant request filtering, etc. all happen in the processor. For example, I know I've had you explain this to me multiple times already, but I still don't understand why we still need Ruby sequencers either :-).

> Brad, I raise the same point that Arka raised earlier. Other processor models can also make use of a store buffer. So why should only O3 have a store buffer?

Nilay, I think that's a different issue... we're not saying that other CPU models can't have store buffers, but in practice, the simple CPU models block on memory accesses, so they don't need one. If the inorder model wants to add a store buffer (if it doesn't already have one), it would be an internal decision for them whether they want to write one from scratch or try to reuse the O3 code. There are already some shared structures in src/cpu, like branch predictors, that can be reused across CPU models.

So in other words, we need to decide first where the store buffer should live (CPU or memory system), and then we can worry about how to reuse that code if that's useful.

Steve
Re: [m5-dev] Store Buffer
So I think Steve and I are in agreement here. We both agree that both speculative and non-speculative store buffers should be on the CPU side of the RubyPort interface. I believe that was the same line that existed when Ruby was tied to Opal in GEMS. I believe the non-speculative store buffer was only a feature used when Opal was not attached, and it was just the simple SimicsProcessor driving Ruby.

The sequencer is a separate issue. Certain functionality of the sequencer can probably be eliminated in gem5, but I think other functionality needs to remain or at least be moved to some other part of Ruby. The sequencer performs a lot of protocol-independent functionality, including: updating the actual data block, performing synchronization with respect to the cache memory, translating m5 packets to ruby requests, checking for per-cacheblock deadlock, and coalescing requests to the same cache block. The coalescing functionality can probably be eliminated, but I think the other functionality needs to remain.

Brad

From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Steve Reinhardt
Sent: Thursday, February 24, 2011 1:52 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer

> I don't know the O3 LSQ model very well, but I assume it buffers both speculative and non-speculative stores. Are there two different structures in Ruby for that?
>
> I think the general issue here is that the dividing line between processor and memory system is different in M5 than it was with GEMS, with M5 assuming that write buffers, redundant request filtering, etc. all happen in the processor. For example, I know I've had you explain this to me multiple times already, but I still don't understand why we still need Ruby sequencers either :-).
>
> Nilay, I think that's a different issue... we're not saying that other CPU models can't have store buffers, but in practice, the simple CPU models block on memory accesses, so they don't need one. If the inorder model wants to add a store buffer (if it doesn't already have one), it would be an internal decision for them whether they want to write one from scratch or try to reuse the O3 code. There are already some shared structures in src/cpu, like branch predictors, that can be reused across CPU models.
>
> So in other words, we need to decide first where the store buffer should live (CPU or memory system), and then we can worry about how to reuse that code if that's useful.
>
> Steve
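
Of the sequencer duties listed in this message, coalescing requests to the same cache block is the easiest to picture in code. The Python sketch below is an assumption-laden toy, not the Ruby Sequencer: `CoalescingTable`, `block_address`, the `issue` callback, and the 64-byte block size are all hypothetical choices made for the example.

```python
from collections import defaultdict

BLOCK_BYTES = 64   # assumed cache block size for this example


def block_address(addr, block_bytes=BLOCK_BYTES):
    """Round an address down to the start of its cache block."""
    return addr - (addr % block_bytes)


class CoalescingTable:
    """Toy table that issues one memory request per outstanding block."""

    def __init__(self, issue):
        self._issue = issue                  # called once per new block
        self._waiting = defaultdict(list)    # block address -> request ids

    def request(self, req_id, addr):
        block = block_address(addr)
        first = block not in self._waiting
        self._waiting[block].append(req_id)
        if first:
            self._issue(block)               # later requests piggyback
        return first

    def complete(self, block):
        """Return every request that was waiting on this block."""
        return self._waiting.pop(block, [])


issued = []
table = CoalescingTable(issued.append)
first_a = table.request("ld0", 0x1004)   # new block 0x1000: issued
first_b = table.request("st1", 0x1038)   # same block: coalesced
first_c = table.request("ld2", 0x1040)   # next block 0x1040: issued
woken = table.complete(0x1000)
```

Two block requests are issued for three accesses, and completing block `0x1000` wakes both `ld0` and `st1`. If the CPU model already filters redundant requests, as the thread assumes for M5, a table like this on the Ruby side would rarely see a second request for the same block.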
Re: [m5-dev] Store Buffer
That's a good question. Before we get rid of it, we should decide what is the interface between Ruby and the o3 LSQ. I don't know how the current o3 LSQ works, but I imagine that we need to pass probe requests through the RubyPort to make it work correctly. Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad

-----Original Message-----
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish
Sent: Wednesday, February 23, 2011 4:51 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Store Buffer

Brad,

In case we remove libruby, what becomes of the store buffer? In fact, is the store buffer in use?

Thanks
Nilay
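As a rough illustration of what passing probe requests through to a load/store queue could mean, here is a minimal sketch. It assumes a conservative policy (any executed but uncommitted load to an invalidated block is replayed, along with everything younger) and a 64-byte block. `ToyLSQ` and its methods are hypothetical names; this is not how the gem5 o3 LSQ is known to work.

```python
BLOCK_BYTES = 64  # assumed cache block size


class ToyLSQ:
    """Hypothetical load queue; not the gem5 O3 LSQ interface."""

    def __init__(self):
        # Executed, uncommitted loads in program order.
        self.loads = []

    def execute_load(self, seq_num, addr):
        # The load has read its value but has not committed yet.
        self.loads.append({"seq": seq_num, "addr": addr})

    def commit_up_to(self, seq_num):
        # Committed loads are no longer speculative and leave the queue.
        self.loads = [ld for ld in self.loads if ld["seq"] > seq_num]

    def recv_invalidation(self, block_addr):
        """Return the sequence number to squash from, or None.

        Any executed-but-uncommitted load to the invalidated block may have
        read a stale value, so it and everything younger must replay.
        """
        hits = [ld["seq"] for ld in self.loads
                if ld["addr"] // BLOCK_BYTES == block_addr // BLOCK_BYTES]
        if not hits:
            return None
        oldest = min(hits)
        self.loads = [ld for ld in self.loads if ld["seq"] < oldest]
        return oldest
```

Whether a squash is actually required depends on the consistency model being simulated; the sketch only shows why the CPU side needs to see the probe at all.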
Re: [m5-dev] Store Buffer
Hi Brad,

I have very little knowledge about the store buffer implementation in libruby and the o3 CPU model. But I have the following high-level question: Is this store buffer in libruby only for keeping retired (non-speculative) stores? If yes, then why does a particular CPU model matter here? In-order cores (CPU models) can also use a (retired) store buffer. I believe it is an issue of the memory consistency model being simulated rather than something tied to a particular CPU model (in-order/out-of-order).

Please correct me if I am totally missing something here.

Thanks
Arka

On 02/23/2011 07:07 PM, Beckmann, Brad wrote:

That's a good question. Before we get rid of it, we should decide what is the interface between Ruby and the o3 LSQ. I don't know how the current o3 LSQ works, but I imagine that we need to pass probe requests through the RubyPort to make it work correctly. Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad

-----Original Message-----
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish
Sent: Wednesday, February 23, 2011 4:51 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Store Buffer

Brad,

In case we remove libruby, what becomes of the store buffer? In fact, is the store buffer in use?

Thanks
Nilay
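Arka's point that a retired store buffer is a property of the consistency model rather than of a particular core can be illustrated with a minimal sketch. It assumes a TSO-like policy: committed stores drain to memory in FIFO order, and a load forwards from the youngest buffered store to the same address. `RetiredStoreBuffer` and the word-granularity dict used as memory are hypothetical, not libruby's store buffer.

```python
from collections import deque


class RetiredStoreBuffer:
    """Hypothetical FIFO buffer of committed stores (TSO-like ordering)."""

    def __init__(self, memory):
        self.memory = memory   # dict: address -> value, stands in for the cache
        self.fifo = deque()    # (address, value), oldest first

    def store(self, addr, value):
        # A committed store waits here instead of stalling the core.
        self.fifo.append((addr, value))

    def load(self, addr):
        # Forward from the youngest buffered store to the same address.
        for st_addr, value in reversed(self.fifo):
            if st_addr == addr:
                return value
        return self.memory.get(addr, 0)

    def drain_one(self):
        # Stores become globally visible strictly in program order.
        if self.fifo:
            addr, value = self.fifo.popleft()
            self.memory[addr] = value
```

Nothing in the class refers to how the core schedules instructions, so an in-order or an out-of-order model could sit in front of it unchanged.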
Re: [m5-dev] Store Buffer
Fundamentally, I wish to handle only non-speculative memory state within Ruby. Otherwise I think there might be a risk of Ruby getting affected by the CPU model's behavior/nuances. As you suggested, RubyPort may well be the line dividing speculative and non-speculative state.

I haven't looked at the store buffer code in libruby and do not know how it interfaces with the protocols. So sorry, I don't have specific answers to your questions. I think Derek is the best person to comment on this as I believe he has used the store buffer implementation for his prior research. I do think, though, that the highest-level (closest to the processor) cache controller (i.e. *-L1Cache.sm) needs to be made aware of the store buffer (unless it is hacked to bypass SLICC).

Thanks
Arka

On 02/23/2011 11:29 PM, Beckmann, Brad wrote:

Sorry, I should have been more clear. It fundamentally comes down to how the Ruby interface helps support memory consistency, especially considering more realistic buffering between the CPU and memory system (both speculative and non-speculative). I'm pretty certain that Ruby and the RubyPort interface will need to be changed. I just want us to fully understand the issues before making any changes or removing certain options.

So are you advocating that the RubyPort interface be the line between speculative memory state and non-speculative memory state?

As far as the current Ruby store buffer goes, how does it work with the L1 cache controller? For instance, if the L1 cache receives a probe/forwarded request to a block that exists in the non-speculative store buffer, what is the mechanism to retrieve the up-to-date data from the buffer entry? Is the mechanism protocol-agnostic?

Brad

-----Original Message-----
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Arkaprava Basu
Sent: Wednesday, February 23, 2011 6:10 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer

Hi Brad,

I have very little knowledge about the store buffer implementation in libruby and the o3 CPU model. But I have the following high-level question: Is this store buffer in libruby only for keeping retired (non-speculative) stores? If yes, then why does a particular CPU model matter here? In-order cores (CPU models) can also use a (retired) store buffer. I believe it is an issue of the memory consistency model being simulated rather than something tied to a particular CPU model (in-order/out-of-order).

Please correct me if I am totally missing something here.

Thanks
Arka

On 02/23/2011 07:07 PM, Beckmann, Brad wrote:

That's a good question. Before we get rid of it, we should decide what is the interface between Ruby and the o3 LSQ. I don't know how the current o3 LSQ works, but I imagine that we need to pass probe requests through the RubyPort to make it work correctly. Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad

-----Original Message-----
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of Nilay Vaish
Sent: Wednesday, February 23, 2011 4:51 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Store Buffer

Brad,

In case we remove libruby, what becomes of the store buffer? In fact, is the store buffer in use?

Thanks
Nilay
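Brad's question above, about a probe that hits a block with entries in the non-speculative store buffer, can be made concrete with a minimal sketch of one possible answer: overlay the buffered bytes on the cached copy before responding. This is an assumption for illustration only. `respond_to_probe` is a hypothetical helper, the 64-byte block size is assumed, stores are assumed not to straddle blocks, and nothing here describes the actual libruby or SLICC mechanism.

```python
BLOCK_BYTES = 64  # assumed cache block size


def respond_to_probe(block_addr, cached_block, store_buffer):
    """Merge buffered store bytes over the cached copy of one block.

    store_buffer is a list of (address, bytes) in program order; later
    entries overwrite earlier ones. Hypothetical helper, not a Ruby or
    SLICC mechanism.
    """
    data = bytearray(cached_block)
    for addr, payload in store_buffer:
        offset = addr - block_addr
        # Skip stores that fall outside the probed block.
        if 0 <= offset and offset + len(payload) <= BLOCK_BYTES:
            data[offset:offset + len(payload)] = payload
    return bytes(data)
```

A merge written this way only needs the block address and the raw data, which is one way the mechanism could stay protocol-agnostic; whether the buffered stores may legally be exposed at that point is a separate consistency question.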