[gem5-users] Re: GCN3/hip constant memory
Hi Dan, Attempting to answers your questions in order: - Yes, by data cache I meant a global memory array. You've highlighted the issue exactly though, by making the array a global memory array instead, it will now be subject to thrashing with other global memory data. You could try experimenting on a real GPU though to see what the performance hit is. That might help you decide if it's worth adding the support. - I am not an expert in the SQC (Tony and Brad, CC'd, are) but I believe it only is used for a) instructions and b) scalars. I am not aware of a way to put non-scalar data in it. - I'm not sure if shared memory here is referring to the traditional shared memory like in CPUs (e.g., the global memory) or what NVIDIA refers to as shared memory (e.g., the per-CU scratchpads)? If it's the traditional definition of shared memory, then it resides in the standard main memory place, and flows through the caches to the cores. If you meant the scratchpads, AMD refers to those as local data stores (LDSs), and they are co-located with the CUs (e.g., Figure 2 in https://ieeexplore.ieee.org/document/8327041). Hope this helps, Matt On Tue, Aug 18, 2020 at 12:32 PM Daniel Gerzhoy wrote: > Matt, > > Thanks for the detailed response. Yeah that sounds pretty involved, I > probably won't go down that path unless I see no other way. > > When you say the data cache do you mean make it a global memory array? > This is actually what I already have, and I wanted to keep the "constant" > data from getting evicted by other global memory data. > > How does the SQC work in terms of data rather than instructions? Could I > have data go in the SQC? > > On that note, where does "Shared" memory reside? > > Thanks, > > Dan > > On Tue, Aug 18, 2020 at 12:41 PM Matt Sinclair via gem5-users < > gem5-users@gem5.org> wrote: > >> Hi Dan, >> >> Tony will have to confirm, but I believe AMD didn’t add support for >> constant memory because none of the applications they looked at used it. >> The mincore error is kind of a catch all, saying that something bad >> happened and you went down a failure path. >> >> Assuming the above is correct, if you wanted to add support for constant >> memory, you’d need to start by adding the appropriate syscall support. I >> suspect the reason you are hitting the mincore error is because your >> program attempted to run an unimplemented syscall and didn’t know what to >> do. If you want to go down this route, I would suggest running with a >> debug build of gem5 and using gdb to try and trace back where the mincore >> failure is coming from, but from personal experience I can tell you this is >> not always 100% effective. Another other option would be to use gdb in the >> application itself and step through it, seeing what ioctls the >> hipMemcpyToSymbol is using under the hood. Anyways, in gem5 you would also >> need to instantiate a separate constant cache and connect that to the >> existing memory hierarchy in the appropriate places. So, as you can >> probably tell, this will likely be a fairly intensive process to get >> working though. >> >> The alternative would be to change your program to use the data cache for >> the array instead of using the constant cache. This would potentially hurt >> the performance of the application, but wouldn’t require adding any new >> features to the simulator. >> >> To answer your other questions more directly: >> >> - the constant memory allocations shouldn’t go to the scalar cache or >> data cache. It uses a separate cache, the constant cache. If you look at >> slides on GCN3 (e.g., slide 23 of: >> https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf >> or Figure 1.1 in >> http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf), >> you’ll see a separate cache from the I$, D$, and scalar cache for constants. >> - See slide 60: >> http://www.m5sim.org/wiki/images/1/19/AMD_gem5_APU_simulator_isca_2018_gem5_wiki.pdf >> for the SQC explanation. >> >> Thanks, >> Matt >> >> On Tue, Aug 18, 2020 at 8:37 AM Daniel Gerzhoy via gem5-users < >> gem5-users@gem5.org> wrote: >> >>> Hey all, >>> >>> Is there a way to use constant memory in the GPU Model right now? >>> >>> Using the >>> >>> *__constant__ float variable[SIZE];* >>> >>> and >>> >>> *hipMemcpyToSymbol(...)* >>> >>> results in a >>> >>> *fatal: syscall mincore (#27) unimplemented.* >>> >>> I've been looking through the code to find a way, but I haven't yet. >>> I guess a clarifying question might be: which cache does constant memory >>> go to? the SQC? Scalar Cache? (Those two actually seem to have the same >>> controller) >>> >>> Thanks, >>> >>> Dan Gerzhoy >>> >>> >>> ___ >>> >>> gem5-users mailing list -- gem5-users@gem5.org >>> >>> To unsubscribe send an email to gem5-users-le...@gem5.org >>> >>> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s >> >>
[gem5-users] Re: GCN3/hip constant memory
Matt, Thanks for the detailed response. Yeah that sounds pretty involved, I probably won't go down that path unless I see no other way. When you say the data cache do you mean make it a global memory array? This is actually what I already have, and I wanted to keep the "constant" data from getting evicted by other global memory data. How does the SQC work in terms of data rather than instructions? Could I have data go in the SQC? On that note, where does "Shared" memory reside? Thanks, Dan On Tue, Aug 18, 2020 at 12:41 PM Matt Sinclair via gem5-users < gem5-users@gem5.org> wrote: > Hi Dan, > > Tony will have to confirm, but I believe AMD didn’t add support for > constant memory because none of the applications they looked at used it. > The mincore error is kind of a catch all, saying that something bad > happened and you went down a failure path. > > Assuming the above is correct, if you wanted to add support for constant > memory, you’d need to start by adding the appropriate syscall support. I > suspect the reason you are hitting the mincore error is because your > program attempted to run an unimplemented syscall and didn’t know what to > do. If you want to go down this route, I would suggest running with a > debug build of gem5 and using gdb to try and trace back where the mincore > failure is coming from, but from personal experience I can tell you this is > not always 100% effective. Another other option would be to use gdb in the > application itself and step through it, seeing what ioctls the > hipMemcpyToSymbol is using under the hood. Anyways, in gem5 you would also > need to instantiate a separate constant cache and connect that to the > existing memory hierarchy in the appropriate places. So, as you can > probably tell, this will likely be a fairly intensive process to get > working though. > > The alternative would be to change your program to use the data cache for > the array instead of using the constant cache. This would potentially hurt > the performance of the application, but wouldn’t require adding any new > features to the simulator. > > To answer your other questions more directly: > > - the constant memory allocations shouldn’t go to the scalar cache or data > cache. It uses a separate cache, the constant cache. If you look at > slides on GCN3 (e.g., slide 23 of: > https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf > or Figure 1.1 in > http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf), > you’ll see a separate cache from the I$, D$, and scalar cache for constants. > - See slide 60: > http://www.m5sim.org/wiki/images/1/19/AMD_gem5_APU_simulator_isca_2018_gem5_wiki.pdf > for the SQC explanation. > > Thanks, > Matt > > On Tue, Aug 18, 2020 at 8:37 AM Daniel Gerzhoy via gem5-users < > gem5-users@gem5.org> wrote: > >> Hey all, >> >> Is there a way to use constant memory in the GPU Model right now? >> >> Using the >> >> *__constant__ float variable[SIZE];* >> >> and >> >> *hipMemcpyToSymbol(...)* >> >> results in a >> >> *fatal: syscall mincore (#27) unimplemented.* >> >> I've been looking through the code to find a way, but I haven't yet. >> I guess a clarifying question might be: which cache does constant memory >> go to? the SQC? Scalar Cache? (Those two actually seem to have the same >> controller) >> >> Thanks, >> >> Dan Gerzhoy >> >> >> ___ >> >> gem5-users mailing list -- gem5-users@gem5.org >> >> To unsubscribe send an email to gem5-users-le...@gem5.org >> >> %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s > > ___ > gem5-users mailing list -- gem5-users@gem5.org > To unsubscribe send an email to gem5-users-le...@gem5.org > %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s ___ gem5-users mailing list -- gem5-users@gem5.org To unsubscribe send an email to gem5-users-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s
[gem5-users] Re: GCN3/hip constant memory
Hi Dan, Tony will have to confirm, but I believe AMD didn’t add support for constant memory because none of the applications they looked at used it. The mincore error is kind of a catch all, saying that something bad happened and you went down a failure path. Assuming the above is correct, if you wanted to add support for constant memory, you’d need to start by adding the appropriate syscall support. I suspect the reason you are hitting the mincore error is because your program attempted to run an unimplemented syscall and didn’t know what to do. If you want to go down this route, I would suggest running with a debug build of gem5 and using gdb to try and trace back where the mincore failure is coming from, but from personal experience I can tell you this is not always 100% effective. Another other option would be to use gdb in the application itself and step through it, seeing what ioctls the hipMemcpyToSymbol is using under the hood. Anyways, in gem5 you would also need to instantiate a separate constant cache and connect that to the existing memory hierarchy in the appropriate places. So, as you can probably tell, this will likely be a fairly intensive process to get working though. The alternative would be to change your program to use the data cache for the array instead of using the constant cache. This would potentially hurt the performance of the application, but wouldn’t require adding any new features to the simulator. To answer your other questions more directly: - the constant memory allocations shouldn’t go to the scalar cache or data cache. It uses a separate cache, the constant cache. If you look at slides on GCN3 (e.g., slide 23 of: https://gpuopen.com/wp-content/uploads/2019/08/RDNA_Architecture_public.pdf or Figure 1.1 in http://developer.amd.com/wordpress/media/2013/12/AMD_GCN3_Instruction_Set_Architecture_rev1.1.pdf), you’ll see a separate cache from the I$, D$, and scalar cache for constants. - See slide 60: http://www.m5sim.org/wiki/images/1/19/AMD_gem5_APU_simulator_isca_2018_gem5_wiki.pdf for the SQC explanation. Thanks, Matt On Tue, Aug 18, 2020 at 8:37 AM Daniel Gerzhoy via gem5-users < gem5-users@gem5.org> wrote: > Hey all, > > Is there a way to use constant memory in the GPU Model right now? > > Using the > > *__constant__ float variable[SIZE];* > > and > > *hipMemcpyToSymbol(...)* > > results in a > > *fatal: syscall mincore (#27) unimplemented.* > > I've been looking through the code to find a way, but I haven't yet. > I guess a clarifying question might be: which cache does constant memory > go to? the SQC? Scalar Cache? (Those two actually seem to have the same > controller) > > Thanks, > > Dan Gerzhoy > > > ___ > > gem5-users mailing list -- gem5-users@gem5.org > > To unsubscribe send an email to gem5-users-le...@gem5.org > > %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s ___ gem5-users mailing list -- gem5-users@gem5.org To unsubscribe send an email to gem5-users-le...@gem5.org %(web_page_url)slistinfo%(cgiext)s/%(_internal_name)s