Re: [m5-dev] Store Buffer

2011-02-25 Thread Arkaprava Basu
In sum, I think we all agree that Ruby is going to handle *only 
non-speculative stores*.  The M5 CPU model(s) handle all stores, speculative 
and non-speculative, that are *yet to be revealed to the memory 
sub-system*.


To make it clearer, as I understand it, we now have the following:

1. All store buffering (speculative and non-speculative) is handled by 
the CPU model in M5.
2. Ruby needs to forward interventions/invalidations received at the L1 
cache controller to the CPU model, so that the CPU model can take 
appropriate action to maintain the required memory consistency 
guarantees (e.g., it may need to flush the pipeline).

OR
CPU models need to check the coherence permission at the L1 cache at 
commit time to know whether intervening writes have happened (this might 
be required to implement a stricter model like SC).


I think we need to provide one of these two mechanisms from the Ruby side 
to support item 2 above. Which one to provide depends on what the M5 CPU 
models want to do to guarantee consistency.
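
A minimal sketch of the split summarized above, in Python for brevity. This is not gem5 or Ruby code: ToyCpu, ToyMemory, on_invalidation and the 64-byte block size are all hypothetical names and values chosen for illustration. The CPU side owns every buffered store, the memory side sees a store only at commit, and the memory side forwards invalidations so the CPU can squash speculative loads.

```python
from dataclasses import dataclass, field

BLOCK_BYTES = 64  # assumed cache block size, for illustration only


def block_of(addr: int) -> int:
    """Return the block-aligned address that contains addr."""
    return addr - (addr % BLOCK_BYTES)


@dataclass
class ToyCpu:
    """Hypothetical CPU model that owns all store buffering."""
    store_buffer: list = field(default_factory=list)     # (addr, value), not yet visible to memory
    speculative_loads: set = field(default_factory=set)  # blocks read by uncommitted loads
    squashes: int = 0

    def execute_store(self, addr, value):
        # Buffered on the CPU side; the memory system does not see it yet.
        self.store_buffer.append((addr, value))

    def execute_load(self, addr):
        self.speculative_loads.add(block_of(addr))

    def on_invalidation(self, addr):
        # Forwarded by the memory system when the L1 loses a block.
        if block_of(addr) in self.speculative_loads:
            self.speculative_loads.clear()
            self.squashes += 1  # stand-in for a pipeline flush

    def commit_stores(self, memory):
        # Reveal buffered stores to the memory system in program order.
        for addr, value in self.store_buffer:
            memory.write(addr, value)
        self.store_buffer.clear()


class ToyMemory:
    """Hypothetical stand-in for the memory system: sees only committed stores."""

    def __init__(self):
        self.data = {}
        self.listeners = []  # CPU models that want forwarded invalidations

    def write(self, addr, value):
        self.data[addr] = value

    def remote_write(self, addr, value):
        # A write by another core invalidates the local copy of the block.
        self.data[addr] = value
        for cpu in self.listeners:
            cpu.on_invalidation(addr)
```

A CPU model that prefers the commit-time check could simply leave on_invalidation empty, which is why forwarding does not constrain the other option.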


Please let me know if you disagree or if I am missing something.

Thanks
Arka




___
m5-dev mailing list
m5-dev@m5sim.org
http://m5sim.org/mailman/listinfo/m5-dev


Re: [m5-dev] Store Buffer

2011-02-25 Thread Steve Reinhardt
This sounds right.  Ruby does need to forward invalidations to the CPU since
some models (including O3) will need to do internal invalidations/flushes to
maintain consistency.  Others can choose to do it other ways (e.g., by
querying the L1 at commit as you suggest), but they have the option of
ignoring the forwarded invalidations, so that's not a problem.

Steve


Re: [m5-dev] Store Buffer

2011-02-25 Thread Beckmann, Brad
It sounds like we are in agreement here, but I just want to make sure we 
clarify one item.  I don't believe simply checking the coherence permissions at 
commit time is sufficient to support stronger consistency models like SC/TSO.  
Instead, you really need to know whether you've ever lost the block 
since the speculative instruction read it.  Therefore, Ruby really does need to 
forward invalidations to the CPU.
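
To illustrate the point, here is a small Python sketch under stated assumptions: ToyL1 and SpeculativeLoad are hypothetical classes, not Ruby or O3 code. The block is invalidated and then refetched between the speculative load and its commit, so a permission check at commit time passes even though the block was lost in between. Only the forwarded invalidation records the loss.

```python
class ToyL1:
    """Hypothetical L1 that tracks which blocks currently hold read permission."""

    def __init__(self):
        self.readable = set()
        self.listeners = []  # callbacks invoked on every invalidation

    def fill(self, block):
        self.readable.add(block)

    def invalidate(self, block):
        self.readable.discard(block)
        for notify in self.listeners:
            notify(block)  # forward the invalidation to the CPU side

    def has_permission(self, block):
        return block in self.readable


class SpeculativeLoad:
    """Hypothetical LSQ entry for a load that has executed but not committed."""

    def __init__(self, l1, block):
        self.block = block
        self.lost_block = False
        l1.listeners.append(self._on_invalidate)

    def _on_invalidate(self, block):
        if block == self.block:
            self.lost_block = True  # sticky: stays set even if the block returns

    def safe_by_commit_check(self, l1):
        # Looks only at the permission held right now.
        return l1.has_permission(self.block)

    def safe_by_forwarded_invalidations(self):
        # Knows whether the block was ever lost since the load executed.
        return not self.lost_block
```

With a fill, an invalidate, and a second fill of the same block, safe_by_commit_check reports the load as safe while safe_by_forwarded_invalidations does not.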

From your responses, it sounded like you understand that as well, but I 
just wanted to make the point clear.

Brad



Re: [m5-dev] Store Buffer

2011-02-24 Thread Nilay Vaish

On Thu, 24 Feb 2011, Arkaprava Basu wrote:

> Fundamentally, I wish to handle only non-speculative memory state within 
> Ruby. Otherwise I think there might be a risk of Ruby getting affected by 
> the CPU model's behavior/nuances. As you suggested, RubyPort may well be 
> the line dividing speculative and non-speculative state.

I also agree that beyond RubyPort, all the stores should be 
non-speculative.

> I haven't looked at the store buffer code in libruby and do not know how it 
> interfaces with the protocols. So sorry, I don't have specific answers to 
> your questions. I think Derek is the best person to comment on this, as I 
> believe he has used the store buffer implementation for his prior research.

I think currently the store buffer is not being used at all. I looked 
through the GEMS code, and some of the protocols do declare a store buffer, 
but none of them makes use of it. In gem5, store buffers are not included 
in the protocol files. In fact, the current libruby code does nothing useful 
at all.

> I do think, though, that the highest-level (closest to the processor) cache 
> controller (i.e., *-L1Cache.sm) needs to be made aware of the store buffer 
> (unless it is hacked to bypass SLICC).
>
> Thanks
> Arka

--
Nilay

> On 02/23/2011 11:29 PM, Beckmann, Brad wrote:
> > Sorry, I should have been more clear.  It fundamentally comes down to how 
> > the Ruby interface helps support memory consistency, especially 
> > considering more realistic buffering between the CPU and memory system 
> > (both speculative and non-speculative).  I'm pretty certain that Ruby and 
> > the RubyPort interface will need to be changed.  I just want us to fully 
> > understand the issues before making any changes or removing certain 
> > options.  So are you advocating that the RubyPort interface be the line 
> > between speculative memory state and non-speculative memory state?
> >
> > As far as the current Ruby store buffer goes, how does it work with the L1 
> > cache controller?  For instance, if the L1 cache receives a probe/forwarded 
> > request to a block that exists in the non-speculative store buffer, what is 
> > the mechanism to retrieve the up-to-date data from the buffer entry?  Is 
> > the mechanism protocol agnostic?
> >
> > Brad





Re: [m5-dev] Store Buffer

2011-02-24 Thread Beckmann, Brad
So we probably don't want to pass speculative store data to the RubyPort, but 
what about speculative load and store requests?  I suspect we do want to send 
them to the RubyPort before the speculation is confirmed.  That might require 
splitting stores into two separate transactions: the request and the actual data 
write.  Also, I suspect that the RubyPort will need to forward probes to the CPU 
models to allow the LSQ to maintain the proper consistency model.  If those two 
things end up being true, then what is the benefit of putting the 
non-speculative store buffer in each protocol, versus just in the O3 CPU model?

I'm not yet ready to advocate that this is the right solution.  I just want us to 
think these issues through before deciding to go down one path or the other.

Brad




Re: [m5-dev] Store Buffer

2011-02-24 Thread Steve Reinhardt
On Thu, Feb 24, 2011 at 11:08 AM, Beckmann, Brad brad.beckm...@amd.com wrote:

> So we probably don't want to pass speculative store data to the RubyPort,
> but what about speculative load and store requests?  I suspect we do want to
> send them to the RubyPort before the speculation is confirmed.  That might
> require splitting stores to two separate transactions: the request and the
> actual data write.  Also I suspect that the RubyPort will need to forward
> probes to the cpu models to allow the LSQ to maintain the proper consistency
> model.  If those two things end up being true, then what is the benefit of
> putting the non-speculative store buffer in each protocol, versus just in
> the o3 cpu model?
>
> I'm not yet ready to advocate that is the right solution.  I just want us
> to think these issues thru before deciding to go down one path or the other.


I also support the concept of thinking things through, but I'm also happy to
comment without having done that yet :-).

My gut instinct is to say that O3 already has an LSQ, so Ruby needs to send
invalidations up to the core to support the consistency model, and if we do
that there's no need for a store buffer in Ruby. I'd like to better
understand the arguments against that approach.  For example, why would we
want to send stores to Ruby when they are still speculative?  Do we have
real examples of systems that send the store address to the L1 cache
speculatively?  If we want to fetch store data more aggressively, wouldn't
it be equivalent to generate a prefetch-with-write-intent first, then
generate the store itself only when it commits?  I think there are machines
that do that.
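
A rough Python sketch of the prefetch-then-store idea, assuming hypothetical names throughout (ToyL1, ToyStoreQueue, prefetch_exclusive); none of these are gem5 or Ruby interfaces. The speculative phase exposes only the address, as a prefetch with write intent, and the data reaches the cache only at commit.

```python
class ToyL1:
    """Hypothetical L1 that counts misses and tracks write permission."""

    def __init__(self):
        self.writable = set()
        self.data = {}
        self.misses = 0

    def prefetch_exclusive(self, block):
        # Acquire write permission without writing any data.
        if block not in self.writable:
            self.misses += 1
            self.writable.add(block)

    def store(self, block, value):
        # Non-speculative store; takes a miss only if permission is missing.
        if block not in self.writable:
            self.misses += 1
            self.writable.add(block)
        self.data[block] = value


class ToyStoreQueue:
    """Hypothetical CPU-side queue holding stores until they commit."""

    def __init__(self, l1):
        self.l1 = l1
        self.pending = []  # (block, value) pairs, still speculative

    def execute(self, block, value):
        # Speculative phase: only the address leaves the core, as a prefetch.
        self.l1.prefetch_exclusive(block)
        self.pending.append((block, value))

    def squash(self):
        # Misspeculation: drop the data; the prefetched block is harmless.
        self.pending.clear()

    def commit(self):
        # Commit phase: the store itself is generated only now.
        for block, value in self.pending:
            self.l1.store(block, value)
        self.pending.clear()
```

In this sketch a squashed store leaves at most an extra writable block in the cache, so the memory side never holds speculative data.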

Steve


Re: [m5-dev] Store Buffer

2011-02-24 Thread Beckmann, Brad
Steve, I think we are in agreement here and we may just be disagreeing about the 
definition of speculative.  From the Ruby perspective, I don't think it really 
matters...I don't think there is a difference between a speculative store address 
request and a prefetch-with-write-intent.  Also, we agree that probes will need 
to be sent to the O3 LSQ to support the consistency model.

My point is that if we believe this functionality is required, what is the 
extra overhead of adding a non-speculative store buffer to the O3 model as 
well?  I think that will be easier than trying to incorporate the current Ruby 
non-speculative store buffer into each protocol.

Overall, I guess I'm concluding that we probably can delete the current Ruby 
store buffer.  Do others agree?

Brad




Re: [m5-dev] Store Buffer

2011-02-24 Thread Nilay Vaish

On Thu, 24 Feb 2011, Beckmann, Brad wrote:

> Steve, I think we are in agreement here and we may just be disagreeing 
> about the definition of speculative.  From the Ruby perspective, I don't 
> think it really matters...I don't think there is a difference between a 
> speculative store address request and a prefetch-with-write-intent. 
> Also, we agree that probes will need to be sent to the O3 LSQ to support 
> the consistency model.
>
> My point is that if we believe this functionality is required, what is 
> the extra overhead of adding a non-speculative store buffer to the O3 
> model as well?  I think that will be easier than trying to incorporate 
> the current Ruby non-speculative store buffer into each protocol.

Brad, I raise the same point that Arka raised earlier. Other processor 
models can also make use of a store buffer. So why should only O3 have a 
store buffer?

Secondly, does every protocol need to know about the existence of a store 
buffer? Or can we let just the CacheMemory interface with the store 
buffer?

> Overall, I guess I'm concluding that we probably can delete the current
> Ruby store buffer.  Do others agree?

I agree. I had a chat with Derek about this; he also agrees that it is not 
required in its present form.

--
Nilay

> Brad




Re: [m5-dev] Store Buffer

2011-02-24 Thread Steve Reinhardt
On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish ni...@cs.wisc.edu wrote:

> On Thu, 24 Feb 2011, Beckmann, Brad wrote:
>
> > Steve, I think we are in agreement here and we may just be disagreeing
> > about the definition of speculative.  From the Ruby perspective, I don't
> > think it really matters...I don't think there is a difference between a
> > speculative store address request and a prefetch-with-write-intent. Also,
> > we agree that probes will need to be sent to the O3 LSQ to support the
> > consistency model.
> >
> > My point is that if we believe this functionality is required, what is the
> > extra overhead of adding a non-speculative store buffer to the O3 model as
> > well?  I think that will be easier than trying to incorporate the current
> > Ruby non-speculative store buffer into each protocol.

I don't know the O3 LSQ model very well, but I assume it buffers both
speculative and non-speculative stores.  Are there two different structures
in Ruby for that?

I think the general issue here is that the dividing line between processor
and memory system is different in M5 than it was with GEMS, with M5
assuming that write buffers, redundant request filtering, etc. all happen
in the processor.  For example, I know I've had you explain this to me
multiple times already, but I still don't understand why we still need Ruby
sequencers either :-).

> Brad, I raise the same point that Arka raised earlier. Other processor
> models can also make use of a store buffer. So why should only O3 have a
> store buffer?

Nilay, I think that's a different issue... we're not saying that other CPU
models can't have store buffers, but in practice, the simple CPU models
block on memory accesses so they don't need one.  If the inorder model wants
to add a store buffer (if it doesn't already have one), it would be an
internal decision for them whether they want to write one from scratch or
try to reuse the O3 code.  There are already some shared structures in
src/cpu like branch predictors that can be reused across CPU models.

So in other words we need to decide first where the store buffer should live
(CPU or memory system) and then we can worry about how to reuse that code if
that's useful.

Steve


Re: [m5-dev] Store Buffer

2011-02-24 Thread Beckmann, Brad
So I think Steve and I are in agreement here.  We both agree that both the 
speculative and non-speculative store buffers should be on the CPU side of the 
RubyPort interface.  I believe that was the same line that existed when Ruby 
was tied to Opal in GEMS.  I believe the non-speculative store buffer was only a 
feature used when Opal was not attached, and it was just the simple 
SimicsProcessor driving Ruby.

The sequencer is a separate issue.  Certain functionality of the sequencer can 
probably be eliminated in gem5, but I think other functionality needs to remain 
or at least be moved to some other part of Ruby.  The sequencer performs a lot 
of protocol independent functionality including: updating the actual data 
block, performing synchronization with respect to the cache memory, translating 
m5 packets to ruby requests, checking for per-cacheblock deadlock, and 
coalescing requests to the same cache block.  The coalescing functionality can 
probably be eliminated, but I think the other functionality needs to remain.
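
For readers unfamiliar with these duties, the following Python sketch shows two of them: coalescing requests to the same cache block and checking for per-cacheblock deadlock. It is a toy under stated assumptions, not the Ruby sequencer; ToySequencer, its method names, and the cycle threshold are all hypothetical.

```python
class ToySequencer:
    """Hypothetical tracker of outstanding requests, one entry per cache block."""

    def __init__(self, deadlock_threshold=1000):
        self.outstanding = {}  # block -> (issue_cycle, [waiting request ids])
        self.deadlock_threshold = deadlock_threshold
        self.issued = []       # blocks actually sent on to the cache controller

    def make_request(self, req_id, block, cycle):
        """Return True if a new request was issued, False if it was coalesced."""
        if block in self.outstanding:
            # Coalesce: piggyback on the request already in flight.
            self.outstanding[block][1].append(req_id)
            return False
        self.outstanding[block] = (cycle, [req_id])
        self.issued.append(block)
        return True

    def data_callback(self, block):
        """The block arrived: retire the entry and return every waiter."""
        _, waiters = self.outstanding.pop(block)
        return waiters

    def check_deadlock(self, cycle):
        """Return blocks whose request has been outstanding for too long."""
        return [
            block
            for block, (issued_at, _) in self.outstanding.items()
            if cycle - issued_at > self.deadlock_threshold
        ]
```

Both duties hang off the same per-block table in this sketch, which suggests that dropping coalescing alone would not remove the need for the table.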

Brad


From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On Behalf Of 
Steve Reinhardt
Sent: Thursday, February 24, 2011 1:52 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer


On Thu, Feb 24, 2011 at 1:32 PM, Nilay Vaish ni...@cs.wisc.edu wrote:
On Thu, 24 Feb 2011, Beckmann, Brad wrote:
Steve, I think we are in agreement here and we may just be disagreeing about the 
definition of speculative.  From the Ruby perspective, I don't think it really 
matters...I don't think there is a difference between a speculative store address 
request and a prefetch-with-write-intent. Also we agree that probes will need 
to be sent to the O3 LSQ to support the consistency model.
My point is that if we believe this functionality is required, what is the 
extra overhead of adding a non-speculative store buffer to the O3 model as 
well?  I think that will be easier than trying to incorporate the current Ruby 
non-speculative store buffer into each protocol.
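
To make the idea of sending probes to the LSQ concrete, here is a minimal sketch of a load queue reacting to an invalidation. It is not the O3 LSQ: the LoadQueue class, its methods, the 64-byte block size, and the conservative policy of squashing the oldest matching load and everything younger are all assumptions chosen for illustration.

```python
BLOCK_BYTES = 64  # assumed cache block size, for illustration only


class LoadQueue:
    """Hypothetical queue of loads that have executed but not committed."""

    def __init__(self):
        self.executed = []  # (seq_num, addr) pairs in program order

    def execute_load(self, seq_num, addr):
        self.executed.append((seq_num, addr))

    def commit_load(self, seq_num):
        self.executed = [(s, a) for s, a in self.executed if s != seq_num]

    def on_invalidation(self, probe_addr):
        """Handle a probe forwarded from the memory system.

        Returns the sequence number to squash from, or None if no
        speculatively executed load touched the probed block.
        """
        block = probe_addr // BLOCK_BYTES
        hits = [s for s, a in self.executed if a // BLOCK_BYTES == block]
        if not hits:
            return None
        oldest = min(hits)
        # Assumed policy: discard the oldest matching load and all
        # younger loads so that they re-execute and see the new value.
        self.executed = [(s, a) for s, a in self.executed if s < oldest]
        return oldest
```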

I don't know the O3 LSQ model very well, but I assume it buffers both 
speculative and non-speculative stores.  Are there two different structures in 
Ruby for that?

I think the general issue here is that the dividing line between processor 
and memory system is different in M5 than it was with GEMS, with M5 assuming 
that write buffers, redundant request filtering, etc. all happen in the 
processor.  For example, I know I've had you explain this to me multiple 
times already, but I still don't understand why we still need Ruby sequencers 
either :-).

Brad, I raise the same point that Arka raised earlier. Other processor models 
can also make use of a store buffer. So why should only O3 have a store buffer?

Nilay, I think that's a different issue... we're not saying that other CPU 
models can't have store buffers, but in practice, the simple CPU models block 
on memory accesses so they don't need one.  If the inorder model wants to add a 
store buffer (if it doesn't already have one), it would be an internal decision 
for them whether they want to write one from scratch or try to reuse the O3 
code.  There are already some shared structures in src/cpu like branch 
predictors that can be reused across CPU models.

So in other words we need to decide first where the store buffer should live 
(CPU or memory system) and then we can worry about how to reuse that code if 
that's useful.
Steve


Re: [m5-dev] Store Buffer

2011-02-23 Thread Beckmann, Brad
That's a good question.  Before we get rid of it, we should decide what the 
interface between Ruby and the o3 LSQ is.  I don't know how the current o3 LSQ 
works, but I imagine that we need to pass probe requests through the RubyPort to 
make it work correctly.

Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad


 -Original Message-
 From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
 On Behalf Of Nilay Vaish
 Sent: Wednesday, February 23, 2011 4:51 PM
 To: m5-dev@m5sim.org
 Subject: [m5-dev] Store Buffer
 
 Brad,
 
 In case we remove libruby, what becomes of the store buffer? In fact, is
 the store buffer in use?
 
 Thanks
 Nilay


Re: [m5-dev] Store Buffer

2011-02-23 Thread Arkaprava Basu

Hi Brad,

  I have very little knowledge about the store buffer 
implementation in libruby and the o3 CPU model. But I have the following 
high-level question:


Is this store buffer in libruby only for keeping retired 
(non-speculative) stores? If yes, then why does a particular CPU model 
matter here? In-order cores (CPU models) can also use a (retired) store 
buffer. I believe it is an issue of the memory consistency model being 
simulated rather than something tied to a particular CPU model 
(in-order/out-of-order).  Please correct me if I am totally missing 
something here.


Thanks
Arka

On 02/23/2011 07:07 PM, Beckmann, Brad wrote:

That's a good question.  Before we get rid of it, we should decide what the 
interface between Ruby and the o3 LSQ is.  I don't know how the current o3 LSQ 
works, but I imagine that we need to pass probe requests through the RubyPort to 
make it work correctly.

Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad



-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
On Behalf Of Nilay Vaish
Sent: Wednesday, February 23, 2011 4:51 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Store Buffer

Brad,

In case we remove libruby, what becomes of the store buffer? In fact, is
the store buffer in use?

Thanks
Nilay


Re: [m5-dev] Store Buffer

2011-02-23 Thread Arkaprava Basu
Fundamentally, I wish to handle only non-speculative memory state within 
Ruby. Otherwise I think there might be a risk of Ruby being affected by 
the CPU model's behavior/nuances. As you suggested, RubyPort may well be 
the line dividing speculative and non-speculative state.
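
A minimal sketch of what that dividing line could look like from the CPU side, assuming a store buffer that holds both speculative and committed stores and releases only committed ones to the memory system. The CpuStoreBuffer class and the memory_port callable are hypothetical; they do not correspond to existing M5 or Ruby interfaces.

```python
from collections import deque


class CpuStoreBuffer:
    """Hypothetical CPU-side buffer; only committed stores leave it."""

    def __init__(self, memory_port):
        # memory_port(addr, data) stands in for the port into the
        # memory system (an assumption, not the real RubyPort API).
        self.memory_port = memory_port
        self.entries = deque()  # [seq_num, addr, data, committed]

    def insert(self, seq_num, addr, data):
        """Buffer a store that is still speculative."""
        self.entries.append([seq_num, addr, data, False])

    def commit(self, seq_num):
        """Mark a store as non-speculative."""
        for entry in self.entries:
            if entry[0] == seq_num:
                entry[3] = True

    def squash(self, from_seq):
        """Drop uncommitted stores at or after from_seq."""
        self.entries = deque(
            e for e in self.entries if e[3] or e[0] < from_seq)

    def drain(self):
        """Send committed stores at the head to memory, in order."""
        while self.entries and self.entries[0][3]:
            _, addr, data, _ = self.entries.popleft()
            self.memory_port(addr, data)
```

With this split, nothing speculative ever crosses the port, so the memory system never has to undo a store.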


I haven't looked at the store buffer code in libruby and do not know 
how it interfaces with the protocols. So sorry, I don't have specific 
answers to your questions. I think Derek is the best person to comment 
on this, as I believe he has used the store buffer implementation for his 
prior research.


I do think, though, that the highest-level (closest to the processor) 
cache controller (i.e. *-L1Cache.sm) needs to be made aware of the store 
buffer (unless it is hacked to bypass SLICC).


 Thanks
Arka

On 02/23/2011 11:29 PM, Beckmann, Brad wrote:

Sorry, I should have been more clear.  It fundamentally comes down to how 
the Ruby interface helps support memory consistency, especially considering more 
realistic buffering between the CPU and memory system (both speculative and 
non-speculative).  I'm pretty certain that Ruby and the RubyPort interface will 
need to be changed.  I just want us to fully understand the issues before making 
any changes or removing certain options.  So are you advocating that the 
RubyPort interface be the line between speculative memory state and 
non-speculative memory state?

As far as the current Ruby store buffer goes, how does it work with the L1 
cache controller?  For instance, if the L1 cache receives a probe/forwarded 
request to a block that exists in the non-speculative store buffer, what is the 
mechanism to retrieve the up-to-date data from the buffer entry?  Is the 
mechanism protocol agnostic?
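
One conceivable shape for such a mechanism, sketched under stated assumptions: on a probe, the buffered stores that fall in the probed block are merged over the cache's copy before the data is supplied. This is not how the libruby store buffer works (the thread leaves that open); RetiredStoreBuffer, merge_into, and the offset-to-value block representation are invented for the example.

```python
class RetiredStoreBuffer:
    """Hypothetical buffer of retired stores not yet written to the cache."""

    def __init__(self):
        self.pending = []  # (addr, value) pairs, oldest first

    def push(self, addr, value):
        self.pending.append((addr, value))

    def merge_into(self, block_addr, block_data, block_bytes=64):
        """Return the block contents with buffered stores applied.

        block_data maps byte offsets within the block to values.
        Stores are applied oldest first, so the youngest store to an
        offset wins.
        """
        merged = dict(block_data)
        for addr, value in self.pending:
            if block_addr <= addr < block_addr + block_bytes:
                merged[addr - block_addr] = value
        return merged
```

Because the merge works only on addresses and data, it would not need to know which coherence protocol is in use, though the L1 controller would still have to call it when a probe arrives.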

Brad



-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org] On
Behalf Of Arkaprava Basu
Sent: Wednesday, February 23, 2011 6:10 PM
To: M5 Developer List
Subject: Re: [m5-dev] Store Buffer

Hi Brad,

I have very little knowledge about the store buffer
implementation in libruby and the o3 CPU model. But I have the following
high-level question:

Is this store buffer in libruby only for keeping retired
(non-speculative) stores? If yes, then why does a particular CPU model
matter here? In-order cores (CPU models) can also use a (retired) store
buffer. I believe it is an issue of the memory consistency model being
simulated rather than something tied to a particular CPU model
(in-order/out-of-order).  Please correct me if I am totally missing
something here.

Thanks
Arka

On 02/23/2011 07:07 PM, Beckmann, Brad wrote:

That's a good question.  Before we get rid of it, we should decide
what the interface between Ruby and the o3 LSQ is.  I don't know how
the current o3 LSQ works, but I imagine that we need to pass probe
requests through the RubyPort to make it work correctly.

Does anyone with knowledge of the o3 LSQ have a suggestion?

Brad



-Original Message-
From: m5-dev-boun...@m5sim.org [mailto:m5-dev-boun...@m5sim.org]
On Behalf Of Nilay Vaish
Sent: Wednesday, February 23, 2011 4:51 PM
To: m5-dev@m5sim.org
Subject: [m5-dev] Store Buffer

Brad,

In case we remove libruby, what becomes of the store buffer? In
fact, is the store buffer in use?

Thanks
Nilay