[gem5-users] Simulating an additional cache after the LLC in gem5

2023-07-05 Thread John Smith via gem5-users
I want to simulate a cache which intercepts the address from the LLC to the
memory controller and uses that address to update certain information in
the cache. Could anyone help me with how I could go about doing this?

Regards,
Vincent
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
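
One way to picture the component described above is a pass-through stage that sees every request the LLC sends toward the memory controller, updates its own per-block bookkeeping, and forwards the request unchanged. The following is a minimal pure-Python sketch of that idea only; it does not use the gem5 API. The names `ObserverCache`, `recv_from_llc`, `send_to_memory`, `touches`, and the 64-byte block size are all hypothetical, chosen for illustration. In gem5 itself, such a component would presumably be a memory-side object placed between the LLC and the memory controller.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Callable

BLOCK_SIZE = 64  # bytes; assumed cache-line size (hypothetical)


@dataclass
class ObserverCache:
    """Toy pass-through stage between an LLC and a memory controller.

    It never services a request itself. It only records metadata
    (a touch count per block) and forwards the request downstream.
    """
    send_to_memory: Callable[[int, bool], None]  # downstream callback
    capacity: int = 4                            # max tracked blocks
    touches: Counter = field(default_factory=Counter)

    def recv_from_llc(self, addr: int, is_write: bool) -> None:
        # Align the address down to its block boundary.
        block = addr // BLOCK_SIZE * BLOCK_SIZE
        # If the table is full and this block is new, drop the
        # least-touched tracked block to make room.
        if block not in self.touches and len(self.touches) >= self.capacity:
            victim = min(self.touches, key=self.touches.get)
            del self.touches[victim]
        self.touches[block] += 1
        # Forward the original request unchanged.
        self.send_to_memory(addr, is_write)


if __name__ == "__main__":
    forwarded = []
    cache = ObserverCache(
        send_to_memory=lambda a, w: forwarded.append((a, w)), capacity=2)
    for addr, is_write in [(0x1004, False), (0x1008, True), (0x2000, False)]:
        cache.recv_from_llc(addr, is_write)
    print(dict(cache.touches), forwarded)
```

The sketch keeps forwarding and bookkeeping separate so that the observed traffic is never altered; what "certain information" to track, and the replacement policy, are left open by the original question.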


[gem5-users] Re: Browsing the gem5 codebase

2023-07-05 Thread Ayaz Akram via gem5-users
Hi John,

You can open gem5 code base in Microsoft VS Code to use different code
navigation options that work for any C/C++ project.

-Ayaz

On Wed, Jul 5, 2023 at 9:29 AM John Smith via gem5-users <
gem5-users@gem5.org> wrote:

> Is there a way to make browsing the gem5 codebase and performing
> functionalities like 'Go to Definition' easier?
>
> Thanks,
> John


[gem5-users] Re: Question about InOrder cpu models

2023-07-05 Thread Eliot Moss via gem5-users

On 7/4/2023 7:17 PM, Eliot Moss via gem5-users wrote:

Dear gem5-ers --

I am thinking of trying to put together something that roughly models ARM's
R82, which is an 8-stage, width-3, in-order CPU.  (It's also not a single
thing, but has numerous options you choose, and then set up RTL and can have
your design manufactured.)  I see that there are three non-SMT and one SMT
in-order pipeline models, but I'm not clear how I would use them -- swap them in
for the one that does not have 5, 9, or smt in its name?  Or what?  I do know
that I'll need to put together a new system model that uses the ARM ISA and is
at least slightly extended from InOrderCPU.py.  Any other things to watch out
for?  Thanks - Eliot


Following up my own question a bit ...

Is InOrderCPU (cpu/inorder) deprecated or something?  Even adding InOrderCPU
to CPU_MODELS in build_opts/ARM does not cause the inorder directory to
compile.  Not sure how to make it happen.

Meanwhile, there is MinorCPU (cpu/minor), which seems perhaps intended to
replace inorder.  Is that right?

Regards - Eliot


[gem5-users] Re: Replacing CPU model in GPU-FS

2023-07-05 Thread Matt Sinclair via gem5-users
Answers:

1.  Yes, I believe so.  However, I have never personally tried using the O3
model with the GPU.  Matt P has, I believe, so he may have better feedback
there.

2.  I have not followed the chain of events all the way through here, but I
*believe* that the builtin you highlighted is used at the compiler level by
HIPCC/LLVM to generate the appropriate assembly for a given AMD GPU.  In
this case (gfx900), I believe there is a 1-1 correlation with this builtin
becoming an s_sleep assembly instruction (maybe with the addition of a
v_mov-type instruction before it to set the register to the appropriate
sleep value).  I am not aware of the s_sleep builtin requiring OS calls (or
emulation).  But what you have described is more generally the issue with
SE mode (CPU, GPU, etc.) -- because SE mode does not model OS calls, the
fidelity of anything involving the OS will be less.  Perhaps a trite way to
answer this is: if the fidelity of the OS calls is important for the
applications you are studying, then I strongly recommend using FS mode.

Hope this helps,
Matt S.

On Tue, Jul 4, 2023 at 6:01 AM Anoop Mysore wrote:

> Thank you so much for the kind and detailed explanations!
>
> Just to clarify: I can use the APU config (apu_se.py) and switch out to an
> O3 CPU, and I would still have the detailed GPU model, and the disconnected
> Ruby model that synchronizes between CPU and GPU at the system-level
> directory -- is that correct?
>
> Last question: when using the APU config for simulating HeteroSync which,
> for example, has a sleep mutex primitive that invokes a
> __builtin_amdgcn_s_sleep(), is there any OS involvement? If yes, would SE
> mode's emulation of those syscalls inexorably sacrifice any fidelity that
> could be argued leads to inaccurate evaluations of heterogeneous coherence
> implementations? Or are there any other factors of insufficient fidelity
> that might be important in this regard?
>
>
> On Fri, Jun 30, 2023 at 7:40 PM Matt Sinclair <
> mattdsinclair.w...@gmail.com> wrote:
>
>> Just to follow-up on 4 and 5:
>>
>> 4.  The synchronization should happen at the directory-level here, since
>> this is the first level of the memory system where both the CPU and GPU are
>> connected.  However, I have not tested whether, when the programmer sets
>> the GLC bit (which should perform the atomic at the GPU's LLC), Ruby has
>> the functionality to send invalidations as appropriate to allow this.  I
>> suspect it would work as is, but would have to check ...
>>
>> 5.  Yeah, for the reasons Matt P already stated O3 is not currently
>> supported in GPUFS.  So GPUSE would be a better option here.  Yes, you can
>> use the apu_se.py script as the base script for running GPUSE experiments.
>> There are a number of examples on gem5-resources for how to get started
>> with this (including HeteroSync), but I normally recommend starting with
>> square if you haven't used the GPU model before:
>> https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/develop/src/gpu/square/.
>> In terms of support for synchronization at different levels of the memory
>> hierarchy, by default the GPU VIPER coherence protocol assumes that all
>> synchronization happens at the system-level (at the directory, in the
>> current implementation).  However, one of my students will be pushing
>> updates (hopefully today) that allow non-system level support (e.g., the
>> GPU LLC "GLC" level as mentioned above).  It sounds like you want to change
>> the cache hierarchy and coherence protocol to add another level of cache
>> (the L3) before the directory and after the CPU/GPU LLCs?  If so, you would
>> need to change the current Ruby support to add this additional level and
>> the appropriate transitions to do so.  However, if you instead meant that
>> you are thinking of the directory level as synchronizing between the CPU
>> and GPU, then you could use the support as is without any changes (I think).
>>
>> Hope this helps,
>> Matt S.
>>
>> On Fri, Jun 30, 2023 at 12:05 PM Poremba, Matthew via gem5-users <
>> gem5-users@gem5.org> wrote:
>>
>>> [Public]
>>>
>>> Hi,
>>>
>>> No worries about the questions! I will try to answer them all, so this
>>> will be a long email 😊:
>>>
>>> The disconnected (or disjoint) Ruby network is essentially the same as
>>> the APU Ruby network used in SE mode -  That is, it combines two Ruby
>>> protocols in one protocol (MOESI_AMD_base and GPU_VIPER).  They are
>>> disjointed because there are no paths / network links between the GPU and
>>> CPU side, simulating a discrete GPU. These protocols work together because
>>> they use the same network messages / virtual channels to the directory –
>>> Basically you cannot simply drop in another CPU protocol and have it work.
>>>
>>> Atomic CPU is working **very** recently – As in this week.  It is on
>>> review board right now and I believe might be part of the gem5 v23.0
>>> release.  However, the reason Atomic and KVM CPUs are requi

[gem5-users] Browsing the gem5 codebase

2023-07-05 Thread John Smith via gem5-users
Is there a way to make browsing the gem5 codebase and performing
functionalities like 'Go to Definition' easier?

Thanks,
John