[gem5-users] Re: GCN3/hip constant memory

2020-08-18 Thread Matt Sinclair via gem5-users
Hi Dan,

Attempting to answers your questions in order:

- Yes, by data cache I meant a global memory array.  You've highlighted the
issue exactly though, by making the array a global memory array instead, it
will now be subject to thrashing with other global memory data.  You could
try experimenting on a real GPU though to see what the performance hit is.
That might help you decide if it's worth adding the support.

- I am not an expert in the SQC (Tony and Brad, CC'd, are) but I believe it
only is used for a) instructions and b) scalars.  I am not aware of a way
to put non-scalar data in it.

- I'm not sure if shared memory here is referring to the traditional shared
memory like in CPUs (e.g., the global memory) or what NVIDIA refers to as
shared memory (e.g., the per-CU scratchpads)?  If it's the traditional
definition of shared memory, then it resides in the standard main memory
place, and flows through the caches to the cores.  If you meant the
scratchpads, AMD refers to those as local data stores (LDSs), and they are
co-located with the CUs (e.g., Figure 2 in
https://ieeexplore.ieee.org/document/8327041).

Hope this helps,
Matt

On Tue, Aug 18, 2020 at 12:32 PM Daniel Gerzhoy 
wrote:

> Matt,
>
> Thanks for the detailed response. Yeah that sounds pretty involved, I
> probably won't go down that path unless I see no other way.
>
> When you say the data cache do you mean make it a global memory array?
> This is actually what I already have, and I wanted to keep the "constant"
> data from getting evicted by other global memory data.
>
> How does the SQC work in terms of data rather than instructions? Could I
> have data go in the SQC?
>
> On that note, where does "Shared" memory reside?
>
> Thanks,
>
> Dan
>
> On Tue, Aug 18, 2020 at 12:41 PM Matt Sinclair via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Hi Dan,
>>
>> Tony will have to confirm, but I believe AMD didn’t add support for
>> constant memory because none of the applications they looked at used it.
>> The mincore error is kind of a catch all, saying that something bad
>> happened and you went down a failure path.
>>
>> Assuming the above is correct, if you wanted to add support for constant
>> memory, you’d need to start by adding the appropriate syscall support.  I
>> suspect the reason you are hitting the mincore error is because your
>> program attempted to run an unimplemented syscall and didn’t know what to
>> do.  If you want to go down this route, I would suggest running with a
>> debug build of gem5 and using gdb to try and trace back where the mincore
>> failure is coming from, but from personal experience I can tell you this is
>> not always 100% effective.  Another other option would be to use gdb in the
>> application itself and step through it, seeing what ioctls the
>> hipMemcpyToSymbol is using under the hood.  Anyways, in gem5 you would also
>> need to instantiate a separate constant cache and connect that to the
>> existing memory hierarchy in the appropriate places.  So, as you can
>> probably tell, this will likely be a fairly intensive process to get
>> working though.
>>
>> The alternative would be to change your program to use the data cache for
>> the array instead of using the constant cache.  This would potentially hurt
>> the performance of the application, but wouldn’t require adding any new
>> features to the simulator.
>>
>> To answer your other questions more directly:
>>
>> - the constant memory allocations shouldn’t go to the scalar cache or
>> data cache.  It uses a separate cache, the constant cache.  If you look at
>> slides on GCN3 (e.g., slide 23 of:
>> https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf
>> or Figure 1.1 in
>> http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf),
>> you’ll see a separate cache from the I$, D$, and scalar cache for constants.
>> - See slide 60:
>> http://www.m5sim.org/wiki/images/1/19/AMD_gem5_APU_simulator_isca_2018_gem5_wiki.pdf
>> for the SQC explanation.
>>
>> Thanks,
>> Matt
>>
>> On Tue, Aug 18, 2020 at 8:37 AM Daniel Gerzhoy via gem5-users <
>> gem5-users@gem5.org> wrote:
>>
>>> Hey all,
>>>
>>> Is there a way to use constant memory in the GPU Model right now?
>>>
>>> Using the
>>>
>>> *__constant__ float variable[SIZE];*
>>>
>>> and
>>>
>>> *hipMemcpyToSymbol(...)*
>>>
>>> results in a
>>>
>>> *fatal: syscall mincore (#27) unimplemented.*
>>>
>>> I've been looking through the code to find a way, but I haven't yet.
>>> I guess a clarifying question might be: which cache does constant memory
>>> go to? the SQC? Scalar Cache? (Those two actually seem to have the same
>>> controller)
>>>
>>> Thanks,
>>>
>>> Dan Gerzhoy
>>>
>>>
>>> ___
>>>
>>> gem5-users mailing list -- gem5-users@gem5.org
>>>
>>> To unsubscribe send an email to gem5-users-le...@gem5.org
>>>
>>> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
>>
>> 

[gem5-users] Re: GCN3/hip constant memory

2020-08-18 Thread Daniel Gerzhoy via gem5-users
Matt,

Thanks for the detailed response. Yeah that sounds pretty involved, I
probably won't go down that path unless I see no other way.

When you say the data cache do you mean make it a global memory array? This
is actually what I already have, and I wanted to keep the "constant" data
from getting evicted by other global memory data.

How does the SQC work in terms of data rather than instructions? Could I
have data go in the SQC?

On that note, where does "Shared" memory reside?

Thanks,

Dan

On Tue, Aug 18, 2020 at 12:41 PM Matt Sinclair via gem5-users <
gem5-users@gem5.org> wrote:

> Hi Dan,
>
> Tony will have to confirm, but I believe AMD didn’t add support for
> constant memory because none of the applications they looked at used it.
> The mincore error is kind of a catch all, saying that something bad
> happened and you went down a failure path.
>
> Assuming the above is correct, if you wanted to add support for constant
> memory, you’d need to start by adding the appropriate syscall support.  I
> suspect the reason you are hitting the mincore error is because your
> program attempted to run an unimplemented syscall and didn’t know what to
> do.  If you want to go down this route, I would suggest running with a
> debug build of gem5 and using gdb to try and trace back where the mincore
> failure is coming from, but from personal experience I can tell you this is
> not always 100% effective.  Another other option would be to use gdb in the
> application itself and step through it, seeing what ioctls the
> hipMemcpyToSymbol is using under the hood.  Anyways, in gem5 you would also
> need to instantiate a separate constant cache and connect that to the
> existing memory hierarchy in the appropriate places.  So, as you can
> probably tell, this will likely be a fairly intensive process to get
> working though.
>
> The alternative would be to change your program to use the data cache for
> the array instead of using the constant cache.  This would potentially hurt
> the performance of the application, but wouldn’t require adding any new
> features to the simulator.
>
> To answer your other questions more directly:
>
> - the constant memory allocations shouldn’t go to the scalar cache or data
> cache.  It uses a separate cache, the constant cache.  If you look at
> slides on GCN3 (e.g., slide 23 of:
> https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf
> or Figure 1.1 in
> http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf),
> you’ll see a separate cache from the I$, D$, and scalar cache for constants.
> - See slide 60:
> http://www.m5sim.org/wiki/images/1/19/AMD_gem5_APU_simulator_isca_2018_gem5_wiki.pdf
> for the SQC explanation.
>
> Thanks,
> Matt
>
> On Tue, Aug 18, 2020 at 8:37 AM Daniel Gerzhoy via gem5-users <
> gem5-users@gem5.org> wrote:
>
>> Hey all,
>>
>> Is there a way to use constant memory in the GPU Model right now?
>>
>> Using the
>>
>> *__constant__ float variable[SIZE];*
>>
>> and
>>
>> *hipMemcpyToSymbol(...)*
>>
>> results in a
>>
>> *fatal: syscall mincore (#27) unimplemented.*
>>
>> I've been looking through the code to find a way, but I haven't yet.
>> I guess a clarifying question might be: which cache does constant memory
>> go to? the SQC? Scalar Cache? (Those two actually seem to have the same
>> controller)
>>
>> Thanks,
>>
>> Dan Gerzhoy
>>
>>
>> ___
>>
>> gem5-users mailing list -- gem5-users@gem5.org
>>
>> To unsubscribe send an email to gem5-users-le...@gem5.org
>>
>> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
>
> ___
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s

[gem5-users] Re: GCN3/hip constant memory

2020-08-18 Thread Matt Sinclair via gem5-users
Hi Dan,

Tony will have to confirm, but I believe AMD didn’t add support for
constant memory because none of the applications they looked at used it.
The mincore error is kind of a catch all, saying that something bad
happened and you went down a failure path.

Assuming the above is correct, if you wanted to add support for constant
memory, you’d need to start by adding the appropriate syscall support.  I
suspect the reason you are hitting the mincore error is because your
program attempted to run an unimplemented syscall and didn’t know what to
do.  If you want to go down this route, I would suggest running with a
debug build of gem5 and using gdb to try and trace back where the mincore
failure is coming from, but from personal experience I can tell you this is
not always 100% effective.  Another other option would be to use gdb in the
application itself and step through it, seeing what ioctls the
hipMemcpyToSymbol is using under the hood.  Anyways, in gem5 you would also
need to instantiate a separate constant cache and connect that to the
existing memory hierarchy in the appropriate places.  So, as you can
probably tell, this will likely be a fairly intensive process to get
working though.

The alternative would be to change your program to use the data cache for
the array instead of using the constant cache.  This would potentially hurt
the performance of the application, but wouldn’t require adding any new
features to the simulator.

To answer your other questions more directly:

- the constant memory allocations shouldn’t go to the scalar cache or data
cache.  It uses a separate cache, the constant cache.  If you look at
slides on GCN3 (e.g., slide 23 of:
https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf
or Figure 1.1 in
http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf),
you’ll see a separate cache from the I$, D$, and scalar cache for constants.
- See slide 60:
http://www.m5sim.org/wiki/images/1/19/AMD_gem5_APU_simulator_isca_2018_gem5_wiki.pdf
for the SQC explanation.

Thanks,
Matt

On Tue, Aug 18, 2020 at 8:37 AM Daniel Gerzhoy via gem5-users <
gem5-users@gem5.org> wrote:

> Hey all,
>
> Is there a way to use constant memory in the GPU Model right now?
>
> Using the
>
> *__constant__ float variable[SIZE];*
>
> and
>
> *hipMemcpyToSymbol(...)*
>
> results in a
>
> *fatal: syscall mincore (#27) unimplemented.*
>
> I've been looking through the code to find a way, but I haven't yet.
> I guess a clarifying question might be: which cache does constant memory
> go to? the SQC? Scalar Cache? (Those two actually seem to have the same
> controller)
>
> Thanks,
>
> Dan Gerzhoy
>
>
> ___
>
> gem5-users mailing list -- gem5-users@gem5.org
>
> To unsubscribe send an email to gem5-users-le...@gem5.org
>
> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
%(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s