[gem5-users] Re: Question Regarding L1 Cache Transient States handling Load Hit in Ruby MOESI CMP Directory protocol

2023-03-22 Thread 章志元 via gem5-users
Hello Jason,
Thanks a lot for your reply! I understand that, by handling the load this way,
the cache subsystem gives the CPU the view that the load happened before the
store. However, in the CPU's commit order (the instruction order seen by the
CPU), the store must have committed before the load: the store has to commit
before its store-miss request can reach the L1 cache and move the cache line
into the SM state. What I'm actually wondering is whether the difference
between:
a. the CPU's store->load commit order, and
b. the order in which the CPU sees these instructions access memory
will raise a violation.

Thanks again,
Zhang Zhiyuan

-Original Messages-
From:"Jason Lowe-Power via gem5-users" 
Sent Time:2023-03-22 23:20:25 (Wednesday)
To: "The gem5 Users mailing list" 
Cc: "章志元" <18300750...@fudan.edu.cn>, "Jason Lowe-Power" 
Subject: [gem5-users] Re: Question Regarding L1 Cache Transient States handling 
Load Hit in Ruby MOESI CMP Directory protocol


Hello,


This is a great question!



The short answer is I believe that the coherence protocol is correct. (Though, 
there could always be unexpected bugs.)


The slightly longer answer: You are probably seeing that the store happens 
before the load in "real" time. However, in the processors' view (i.e., 
*logical* time), the load is actually happening before the store. Provided 
the processors correctly implement their consistency models (e.g., if they 
are sequentially consistent, they don't allow any reordering between load and 
store instructions within each thread), the implementation is correct as long 
as it *appears* that the load completed before the store. To put it another 
way, if the thread doing the load cannot tell that the load happened after 
the store (in real time), then it is safe.


It's something like the Lamport Clock: 
https://en.wikipedia.org/wiki/Lamport_timestamp


We have a saying in English: "If a tree falls in a forest and no one is there 
to hear it, does it make a sound?" Similarly, if a thread does a store to an 
address, but no other thread can tell what the ordering needs to be, it's OK to 
reorder it :).


Cheers,
Jason


On Tue, Mar 21, 2023 at 11:50 PM 章志元 via gem5-users  wrote:

Hi all,
  I've been looking into the default MOESI CMP Directory protocol, and I 
noticed that when an L1 cache line is in the SM state (the transient state 
during a Shared-to-Exclusive upgrade caused by a store miss) and a load 
arrives from the local core (which hits, since the line is technically still 
in the Shared state), the cache returns the old Shared Datablk as the load-hit 
result. Will this cause memory-ordering inconsistencies between the core and 
the memory system? The CPU commits the store first and then commits the load 
that returned the old data, but the memory system sees the load hit finish 
first and then sees the GETX finish.
  I also suspect that such loads will probably never arrive at the L1 cache 
controller, since they would be blocked, or forwarded newer data, because of 
outstanding stores in the LSQ or the mandatory queue. I'm just wondering 
whether the cache protocol itself is sound in terms of request ordering.
  Thanks in advance!
Zhang Zhiyuan
2023.3.22
--
Name: 章志元 (Zhang Zhiyuan)
Mobile: 17717877306
Email: zhiyuanzhan...@fudan.edu.cn


___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org




[gem5-users] Re: Retired instructions versus ticks

2023-03-22 Thread Priyanka Ankolekar via gem5-users
Thank you, Eliot.
I think this would give me what I need.

Priyanka.

On Wed, Mar 22, 2023, 11:55 AM Eliot Moss  wrote:

> On 3/22/2023 12:09 PM, Priyanka Ankolekar via gem5-users wrote:
> >
> > Regarding the other part of your email:
> > Let me begin by saying I am a novice to both RISCV and gem5.
> > I have a RISCV RTL with a certain config. I have set up gem5 to match
> > that configuration. I want to make sure that they are indeed equivalent
> > so that I can run some experiments on gem5 (instead of on RTL) since
> > that would be faster and easier. In order to establish that equivalence,
> > I am running a simple benchmark test on both RTL and gem5. The final
> > numbers like DMIPS/MHz etc match fairly closely. But I want to dig
> > further to see if the retired instruction/s at a given tick, for both
> > these setups, are also a close match.
> > Hence the questions.
>
> My suggestion would be to:
>
> - Read the CSR at points of interest - one hopes not *too* many points to
>   avoid being overwhelmed with output.  Do this in gem5 and in your RTL.
>
> - Add code to gem5 to print the tick and the value read when the CSR is
>   read.  A DPRINTF call would serve nicely.  grep can help you find where
>   the right code is using the register name.
>
> Would this do the trick?
>
> Best - EM
>


[gem5-users] Re: Retired instructions versus ticks

2023-03-22 Thread Eliot Moss via gem5-users

On 3/22/2023 12:09 PM, Priyanka Ankolekar via gem5-users wrote:


> Regarding the other part of your email:
> Let me begin by saying I am a novice to both RISCV and gem5.
> I have a RISCV RTL with a certain config. I have set up gem5 to match that
> configuration. I want to make sure that they are indeed equivalent so that
> I can run some experiments on gem5 (instead of on RTL) since that would be
> faster and easier. In order to establish that equivalence, I am running a
> simple benchmark test on both RTL and gem5. The final numbers like
> DMIPS/MHz etc match fairly closely. But I want to dig further to see if the
> retired instruction/s at a given tick, for both these setups, are also a
> close match.
>
> Hence the questions.


My suggestion would be to:

- Read the CSR at points of interest - one hopes not *too* many points to avoid being overwhelmed 
with output.  Do this in gem5 and in your RTL.


- Add code to gem5 to print the tick and the value read when the CSR is read.  A DPRINTF 
call would serve nicely.  grep can help you find where the right code is using the register name.
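
Once both setups print such samples, the comparison itself can be scripted. The following is a minimal Python sketch, not part of gem5: the log format (one "<cycle> <minstret>" pair per line), the function names `parse_samples` and `compare_samples`, and the 5% tolerance are all assumptions made for illustration. It also assumes gem5 ticks have already been converted to cycles so that both logs use the same time unit.

```python
import math


def parse_samples(text):
    """Parse lines of the assumed form '<cycle> <minstret>' into int tuples."""
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        cycle_str, count_str = line.split()
        samples.append((int(cycle_str), int(count_str)))
    return samples


def compare_samples(gem5, rtl, rel_tol=0.05):
    """Return (index, gem5_sample, rtl_sample) for samples that disagree.

    Samples are paired by position, i.e. the i-th point of interest in one
    log is compared with the i-th point of interest in the other.
    """
    mismatches = []
    for i, (g, r) in enumerate(zip(gem5, rtl)):
        cycles_ok = math.isclose(g[0], r[0], rel_tol=rel_tol)
        counts_ok = math.isclose(g[1], r[1], rel_tol=rel_tol)
        if not (cycles_ok and counts_ok):
            mismatches.append((i, g, r))
    return mismatches


# Hypothetical sample logs, invented for this example.
GEM5_LOG = "100 50\n200 120\n"
RTL_LOG = "100 50\n260 120\n"

mismatches = compare_samples(parse_samples(GEM5_LOG), parse_samples(RTL_LOG))
for index, g, r in mismatches:
    print(f"sample {index}: gem5={g} rtl={r}")
```

Pairing samples by position rather than by timestamp keeps the comparison meaningful even when the two setups reach the same program point at different times.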


Would this do the trick?

Best - EM


[gem5-users] Re: Retired instructions versus ticks

2023-03-22 Thread Priyanka Ankolekar via gem5-users
Regarding the other part of your email:
Let me begin by saying I am a novice to both RISCV and gem5.
I have a RISCV RTL with a certain config. I have set up gem5 to match that
configuration. I want to make sure that they are indeed equivalent so that
I can run some experiments on gem5 (instead of on RTL) since that would be
faster and easier. In order to establish that equivalence, I am running a
simple benchmark test on both RTL and gem5. The final numbers like
DMIPS/MHz etc match fairly closely. But I want to dig further to see if the
retired instruction/s at a given tick, for both these setups, are also a
close match.
Hence the questions.


On Wed, Mar 22, 2023 at 8:23 AM Eliot Moss  wrote:

> On 3/22/2023 11:11 AM, Priyanka Ankolekar wrote:
> > Sorry, I should have clarified. I am using the RISCV ISA in gem5.
>
> (As you could have done,) I checked the gem5 sources,
> and it *does* model that register, returning totalInsts
> as gem5 calculates that.  Presumably that is the same as
> statistics will give you, but you could read it on the
> fly.  Not sure if the instruction to read that is
> privileged, though if it is, you could (as a hack)
> change gem5 to allow it to be read in user mode.
>
> Cheers - EM
>
> PS: You did not respond to the other part of what I
> said: What is it that you are really trying to do that
> the previous suggestions do not satisfy?  Cheers - EM
>


[gem5-users] Re: Retired instructions versus ticks

2023-03-22 Thread Eliot Moss via gem5-users

On 3/22/2023 11:11 AM, Priyanka Ankolekar wrote:

> Sorry, I should have clarified. I am using the RISCV ISA in gem5.


(As you could have done,) I checked the gem5 sources,
and it *does* model that register, returning totalInsts
as gem5 calculates that.  Presumably that is the same as
statistics will give you, but you could read it on the
fly.  Not sure if the instruction to read that is
privileged, though if it is, you could (as a hack)
change gem5 to allow it to be read in user mode.

Cheers - EM

PS: You did not respond to the other part of what I
said: What is it that you are really trying to do that
the previous suggestions do not satisfy?  Cheers - EM


[gem5-users] Re: Question Regarding L1 Cache Transient States handling Load Hit in Ruby MOESI CMP Directory protocol

2023-03-22 Thread Jason Lowe-Power via gem5-users
Hello,

This is a great question!

The short answer is I believe that the coherence protocol is correct.
(Though, there could always be unexpected bugs.)

The slightly longer answer: You are probably seeing that the store happens
before the load in "real" time. However, in the processors' view (i.e.,
*logical* time), the load is actually happening before the store. Provided
the processors correctly implement their consistency models (e.g., if they
are sequentially consistent, they don't allow any reordering between load
and store instructions within each thread), the implementation is correct
as long as it *appears* that the load completed before the store. To put it
another way, if the thread doing the load cannot tell that the load
happened after the store (in real time), then it is safe.

It's something like the Lamport Clock:
https://en.wikipedia.org/wiki/Lamport_timestamp
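
For readers unfamiliar with the idea, here is a minimal Python sketch of a Lamport clock. It is not gem5 or Ruby code; the `LamportClock` class and its method names are assumptions made up for this illustration. It only shows how a logical order is derived from local events and messages, independent of wall-clock time.

```python
class LamportClock:
    """Toy logical clock for one thread (hypothetical, not gem5 code)."""

    def __init__(self):
        self.time = 0

    def local_event(self):
        # Any local step (e.g. a load or a store) advances the logical clock.
        self.time += 1
        return self.time

    def send(self):
        # Sending is itself an event; the timestamp travels with the message.
        return self.local_event()

    def receive(self, msg_time):
        # On receipt, move past both our own clock and the sender's timestamp.
        self.time = max(self.time, msg_time) + 1
        return self.time


reader = LamportClock()
writer = LamportClock()

load_ts = reader.local_event()      # the reader's load
writer.local_event()                # unrelated work on the writer
store_ts = writer.send()            # the writer's store, announced to others
seen_ts = reader.receive(store_ts)  # the reader learns about the store

print(load_ts, store_ts, seen_ts)
```

The load and the store are not causally related here, so placing the load first in logical time is consistent with everything either thread can observe, whichever one happened first in wall-clock time.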

We have a saying in English: "If a tree falls in a forest and no one is
there to hear it, does it make a sound?" Similarly, if a thread does a
store to an address, but no other thread can tell what the ordering needs
to be, it's OK to reorder it :).

Cheers,
Jason

On Tue, Mar 21, 2023 at 11:50 PM 章志元 via gem5-users 
wrote:

> Hi all,
>   I've been looking into the default MOESI CMP Directory protocol, and I
> noticed that when an L1 cache line is in the SM state (the transient state
> during a Shared-to-Exclusive upgrade caused by a store miss) and a load
> arrives from the local core (which hits, since the line is technically
> still in the Shared state), the cache returns the old Shared Datablk as the
> load-hit result. Will this cause memory-ordering inconsistencies between
> the core and the memory system? The CPU commits the store first and then
> commits the load that returned the old data, but the memory system sees the
> load hit finish first and then sees the GETX finish.
>   I also suspect that such loads will probably never arrive at the L1 cache
> controller, since they would be blocked, or forwarded newer data, because
> of outstanding stores in the LSQ or the mandatory queue. I'm just wondering
> whether the cache protocol itself is sound in terms of request ordering.
>   Thanks in advance!
> Zhang Zhiyuan
> 2023.3.22


[gem5-users] Re: Retired instructions versus ticks

2023-03-22 Thread Priyanka Ankolekar via gem5-users
Sorry, I should have clarified. I am using the RISCV ISA in gem5.

On Wed, Mar 22, 2023, 5:44 AM Eliot Moss  wrote:

> On 3/22/2023 8:37 AM, Priyanka Ankolekar via gem5-users wrote:
> > Thank you, Eliot.
> >
> > Is there a way to probe minstret CSR to get the retired instructions?
>
> ??  What ISA are you talking about?
>
> I doubt that gem5 would model such details of a processor
> architecture.  Maybe you should back up a little and tell
> us what you're really trying to do, since neither the
> retired instructions stats nor a full trace seem to meet
> your need ...
>
> Best - Eliot Moss
>


[gem5-users] Re: Retired instructions versus ticks

2023-03-22 Thread Eliot Moss via gem5-users

On 3/22/2023 8:37 AM, Priyanka Ankolekar via gem5-users wrote:

> Thank you, Eliot.
>
> Is there a way to probe minstret CSR to get the retired instructions?


??  What ISA are you talking about?

I doubt that gem5 would model such details of a processor
architecture.  Maybe you should back up a little and tell
us what you're really trying to do, since neither the
retired instructions stats nor a full trace seem to meet
your need ...

Best - Eliot Moss


[gem5-users] Re: Retired instructions versus ticks

2023-03-22 Thread Priyanka Ankolekar via gem5-users
Thank you, Eliot.

Is there a way to probe minstret CSR to get the retired instructions?

Thanks
Priyanka.

On Mon, Mar 20, 2023, 2:45 PM Eliot Moss  wrote:

> On 3/20/2023 5:05 PM, Priyanka Ankolekar via gem5-users wrote:
> > Hi Eliot,
> > (Picking this up again after a while.) :-)
> >
> > Thank you for your detailed answer. I was able to get a lot of useful
> > data points from these statistics.
> > Is there a way to get what instruction was retired/committed and when
> > (tick)?
>
> That would be a full trace.  For that, look into the various debug flags.
>
> Be prepared for a LOT of output!!
>
> Best - EM
>


[gem5-users] Question Regarding L1 Cache Transient States handling Load Hit in Ruby MOESI CMP Directory protocol

2023-03-22 Thread 章志元 via gem5-users
Hi all,
  I've been looking into the default MOESI CMP Directory protocol, and I 
noticed that when an L1 cache line is in the SM state (the transient state 
during a Shared-to-Exclusive upgrade caused by a store miss) and a load 
arrives from the local core (which hits, since the line is technically still 
in the Shared state), the cache returns the old Shared Datablk as the load-hit 
result. Will this cause memory-ordering inconsistencies between the core and 
the memory system? The CPU commits the store first and then commits the load 
that returned the old data, but the memory system sees the load hit finish 
first and then sees the GETX finish.
  I also suspect that such loads will probably never arrive at the L1 cache 
controller, since they would be blocked, or forwarded newer data, because of 
outstanding stores in the LSQ or the mandatory queue. I'm just wondering 
whether the cache protocol itself is sound in terms of request ordering.
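
To make the scenario concrete, here is a toy Python model of a single cache line and a core with a store queue. It is a sketch under stated assumptions, not the SLICC protocol code: `ToyL1Line`, `ToyCore`, and their methods are hypothetical names, only the S, SM and M states are modelled, at most one store is outstanding, and the store-queue forwarding reflects my speculation above rather than confirmed gem5 behaviour.

```python
from typing import List, Optional


class ToyL1Line:
    """One cache line with states S, SM and M (hypothetical toy model)."""

    def __init__(self, data: int):
        self.state = "S"
        self.data = data
        self.pending: Optional[int] = None  # store data waiting for GETX

    def load(self) -> int:
        # Loads hit in S, SM and M; in SM they still see the old data.
        return self.data

    def store(self, value: int) -> None:
        if self.state == "M":
            self.data = value
        else:
            # Store miss on a shared line: issue GETX and wait in SM.
            self.state = "SM"
            self.pending = value

    def getx_done(self) -> None:
        # Upgrade finished: the store is now performed on the line.
        assert self.state == "SM"
        self.data = self.pending
        self.pending = None
        self.state = "M"


class ToyCore:
    """Core whose loads check an assumed store queue before the cache."""

    def __init__(self, line: ToyL1Line):
        self.line = line
        self.store_queue: List[int] = []

    def store(self, value: int) -> None:
        self.store_queue.append(value)
        self.line.store(value)

    def load(self) -> int:
        if self.store_queue:
            return self.store_queue[-1]  # forward the youngest store
        return self.line.load()

    def store_completed(self) -> None:
        self.line.getx_done()
        self.store_queue.pop(0)


line = ToyL1Line(data=1)
core = ToyCore(line)
core.store(2)
state_during_upgrade = line.state  # "SM"
raw_cache_value = line.load()      # what a load hit in SM would return
forwarded_value = core.load()      # what the core sees with forwarding
core.store_completed()
final_value = core.load()
```

In this toy model the raw cache line still returns the old value while in SM, but the core never observes it because the load is satisfied from the store queue first.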
  Thanks in advance!
Zhang Zhiyuan
2023.3.22