[gem5-users] Re: GPU virtual memory system

2021-11-08 Thread Imad Al Assir via gem5-users
As a follow-up on this, I saw that in the config.dot and config.ini files 
generated in the m5out folder, there is only 1 memory controller in the case of 
dGPU but different memory pools for the CPU and dGPU.
In that case, what is the real difference between APU and dGPU?
Usually, host-to-device (and device-to-host) transfers represent a big 
overhead, so how are they modeled in this case where the transfers effectively 
happen in the same memory (across different memory pools)?
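One way to count memory controllers in config.ini could look like the sketch below. It assumes that each SimObject appears as an INI section with a `type=` key and that memory controllers have the type `MemCtrl`; the sample text, the section names, and the helper name `count_types` are made up for illustration and are not taken from an actual run.

```python
import configparser
from collections import Counter

# Hypothetical excerpt in the style of m5out/config.ini. Section names and
# type values are illustrative assumptions, not output from a real run.
SAMPLE_CONFIG = """
[system]
type=System
children=cpu mem_ctrls ruby

[system.cpu]
type=TimingSimpleCPU

[system.mem_ctrls]
type=MemCtrl

[system.ruby]
type=RubySystem
"""


def count_types(config_text):
    """Return a Counter mapping each 'type' value to its number of sections."""
    # interpolation=None keeps '%' characters in values from being interpreted.
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.read_string(config_text)
    return Counter(
        parser[section]["type"]
        for section in parser.sections()
        if "type" in parser[section]
    )


type_counts = count_types(SAMPLE_CONFIG)
print("memory controllers:", type_counts["MemCtrl"])
```

To try it on a real run, the contents of m5out/config.ini would be passed in place of SAMPLE_CONFIG, and the type name adjusted to whatever the file actually contains.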

Thanks again,
Imad

___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org

[gem5-users] GPU virtual memory system

2021-10-29 Thread Imad Al Assir via gem5-users
Hello,
I have been looking at the source code of the GPU model for the past few weeks, 
and I have some questions about the virtual memory system for discrete GPUs (and 
APUs if there are any differences). I will include my questions and partial 
answers below, and I hope you can correct me if I'm wrong. Also, it would be 
great if you could point me to the documentation/source code where each of these 
answers can be found.
1- Where are the page tables located exactly? Who manages them?
I saw that the page tables are emulated (i.e. with the EmulationPageTable 
structure) and that the GPU uses the host x86 page tables. But since there is 
no OS, who manages them and where are they located exactly? In the Ruby memory 
of the CPU?
2- How do page walks happen?
I saw a comment saying that they are not real page walks, and that the CPU's 
x86 page table walkers (PTWs) are used. But how is the translation from the 
page table actually fetched if the walk is not real? Don't the page walkers 
still have to walk the tables in memory?
3- How are page faults handled if there is no OS?
4- What components of the VM hierarchy are already present: IOMMU, TLBs, PWC, 
PTWs?
What I am sure of is that there is a customizable TLB hierarchy and TLB 
coalescers. As for the IOMMU, I was not able to figure out what it consisted 
of. I know that there is a PTW and that the model uses the CPU's x86 page 
tables to do the translations. But how many PTWs are there? GPUs usually 
require multiple PTWs, so is this number customizable? Also, I did not see any 
page walk caches or IOMMU TLBs. Are these not present in the current model? If 
I am wrong, please point me to the source code of each component (and where 
they are instantiated).

I saw that a paper published by AMD in the latest MICRO 
(https://dl.acm.org/doi/10.1145/3466752.3480105) used the GPU model, and that 
they had all of the components mentioned in question 4, so are these publicly 
available to everyone or should I implement them myself?
5- I saw a comment in gpu_compute_driver.cc saying: "TODO: IOMMU and GPUTLBs do 
not seem to correctly support shootdown". Does this mean that TLB shootdown is 
not working at all? And when you say IOMMU, what do you mean exactly (since 
there is no concrete IOMMU component), i.e. what does it consist of?
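For context on question 2: on x86-64 with four-level paging and 4 KiB pages, a real walk splits a 48-bit virtual address into four 9-bit table indices and a 12-bit page offset. The sketch below shows only that split, using a made-up address and a hypothetical helper name; it does not describe how gem5 implements or shortcuts the walk.

```python
PAGE_OFFSET_BITS = 12  # 4 KiB pages
INDEX_BITS = 9         # 512 entries per table level
LEVELS = ("PML4", "PDPT", "PD", "PT")  # top level first


def split_virtual_address(vaddr):
    """Split a 48-bit virtual address into per-level indices and a page offset."""
    offset = vaddr & ((1 << PAGE_OFFSET_BITS) - 1)
    indices = {}
    # The top-level index starts at bit 39; each lower level is 9 bits further down.
    shift = PAGE_OFFSET_BITS + INDEX_BITS * (len(LEVELS) - 1)
    for level in LEVELS:
        indices[level] = (vaddr >> shift) & ((1 << INDEX_BITS) - 1)
        shift -= INDEX_BITS
    return indices, offset


# Made-up address built so that each field is easy to recognise.
example_vaddr = (1 << 39) | (2 << 30) | (3 << 21) | (4 << 12) | 0x5
indices, offset = split_virtual_address(example_vaddr)
print(indices, hex(offset))
```

A walker that really walks the tables would perform one memory read per level, using each index to select an entry in the table that the previous level points to.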
Sorry for the long e-mail and thank you in advance for your help,
Imad Al Assir

[gem5-users] Re: Full-system discrete GPU simulation

2021-10-01 Thread Imad Al Assir via gem5-users
Another question, if I may: is the discrete GPU option (in SE mode) supported? I 
saw an option for it (dGPU), but I wasn't sure whether it works.

Thanks again,
Imad

[gem5-users] Re: Full-system discrete GPU simulation

2021-10-01 Thread Imad Al Assir via gem5-users
Oh okay, thanks for the info. Is there an expected date for the next release?

Original message
From: "Poremba, Matthew"
Date: 1/10/21 23:10 (GMT+01:00)
To: gem5 users mailing list
Cc: Imad Al Assir
Subject: RE: [gem5-users] Full-system discrete GPU simulation

[AMD Official Use Only]

Hi Imad,

It is still not supported on either stable or develop. We are submitting patches 
for this over time so as not to overwhelm the (volunteer) reviewers. It will 
most likely be supported in the next gem5 release.

-Matt



[gem5-users] Full-system discrete GPU simulation

2021-10-01 Thread Imad Al Assir via gem5-users
Hello,

Is it now possible to run full-system discrete GPU simulation? I have seen some 
files about that in gem5/configs/example/gpufs in the main branch of the latest 
gem5 version. I have tried running it with the docker image of the GPU model, 
with similar commands, but it failed. I have looked at the video titled 
"Towards full-system discrete GPU simulation" which dates back to June 1st, 
2020, and there had been a lot of work in progress to support full-system mode.
If this mode is now supported, could you provide some documentation on how to run 
it?
If it is still not supported, could you please give an update on which stage it is 
at, and what is left to be done?

Many thanks in advance,
Imad


[gem5-users] Re: gem5 GCN GPU docker error

2021-09-22 Thread Imad Al Assir via gem5-users
Dear Matt,

Many thanks for catching this error! It did indeed solve the problem; I was 
able to successfully run square and other applications from hip-samples on 
both the image built manually from the dockerfile, with everything related to 
rocBLAS and MIOpen commented out, and the pre-built docker image, which I 
believe has rocBLAS and MIOpen installed (based on its size).
Many thanks again,
Imad

On Sep 22 2021, at 6:48 pm, Poremba, Matthew  wrote:
>
> [AMD Official Use Only]
>
>
> Hi Imad,
>
>
> Yes, the docker seems to have broken in the past few days.
>
> Regarding the benchmark not completing, please change your command to use 3 
> CPUs:
>
>
> docker run --rm -v $PWD/gem5:/gem5 -v $PWD/gem5-resources:/gem5-resources \
> -w /gem5 gcr.io/gem5-test/gcn-gpu \
> build/GCN3_X86/gem5.opt configs/example/apu_se.py -n3 \
> --benchmark-root=/gem5-resources/src/gpu/square/bin \
> -c square
>
> ROCm 4.0 requires 3 CPUs to run now. I thought we had updated the README.md 
> and website before the gem5 21.1 release to reflect this, but it looks like 
> they are not up to date.
>
>
> -Matt
>
> From: Imad Al Assir via gem5-users 
> Sent: Wednesday, September 22, 2021 9:31 AM
> To: Matt Sinclair 
> Cc: gem5 users mailing list ; Kyle Roarty 
> ; Imad Al Assir 
> Subject: [gem5-users] Re: gem5 GCN GPU docker error
>
>
>
>
>
> Hello,
> Thank you for your reply. I was simply following the documentation on the 
> gem5 website: https://www.gem5.org/documentation/general_docs/gpu_models/GCN3
> In other words, to build the image, I used:
>
>
> docker build -t gcn-gpu .
>
>
>
> This command didn't complete and was interrupted by the error I pasted in the 
> previous mail.
>
>
> I was also using the command in the documentation to compile square:
> docker run --rm -v $PWD/gem5-resources:$PWD/gem5-resources \
> -w $PWD/gem5-resources/src/gpu/square gcr.io/gem5-test/gcn-gpu make square
>
>
>
> Note that I used "make square", not "make gfx8-apu" as written in the 
> documentation. The latter caused the error "no rule to make target 
> 'gfx8-apu'", which I assumed was due to a typo.
>
>
> To run it, I also used the command in the doc:
> docker run --rm -v $PWD/gem5:/gem5 -v $PWD/gem5-resources:/gem5-resources \
> -w /gem5 gcr.io/gem5-test/gcn-gpu \
> build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2 \
> --benchmark-root=/gem5-resources/src/gpu/square/bin \
> -c square
>
>
>
> Note that in these commands, I modified the path of square to 
> 'gem5-resources/src/gpu/square' instead of 'gem5-resources/src/square', 
> because that's where I found the code for it.
> Also note that I tried downloading the pre-built binary of square (from the 
> gem5-resources website: http://resources.gem5.org/README), but the result 
> was the same: the application ran indefinitely.
>
>
>
> Thanks again for your help,
> Imad
>
>
>
> PS: If it helps, here are the last things printed when running square in gem5 
> in the pre-built docker image:
>
>
> [...] just warnings
>
>
> gem5 Simulator System. http://gem5.org
> gem5 is copyrighted software; use the --copyright option for details.
>
>
>
> gem5 version 21.1.0.1
> gem5 compiled Sep 21 2021 14:52:55
>
> gem5 started Sep 22 2021 15:26:26
>
> gem5 executing on 8d532399b09e, pid 1
>
> command line: build/GCN3_X86/gem5.opt configs/example/apu_se.py -n2 
> --benchmark-root=/gem5-resources/src/gpu/square/bin -c square
>
>
>
> info: Standard input is not a terminal, disabling listeners.
> Num SQC = 1 Num scalar caches = 1 Num CU = 4
>
> coalescer.slave is deprecated. `slave` is now called `in_ports`
>
> warn: coalescer.slave is deprecated. `slave` is now called `in_ports`
>
> warn: coalescer.slave is deprecated. `slave` is now called `in_ports`
>
>
>
> [...] same warning as the one right above this line, repeated multiple times
>
>
> warn: system.ruby.

[gem5-users] Re: gem5 GCN GPU docker error

2021-09-22 Thread Imad Al Assir via gem5-users
warn: unimplemented ioctl: AMDKFD_IOC_SET_TRAP_HANDLER
info: running on device
info: architecture on AMD GPU device is: 801
info: allocate host and device mem ( 7.63 MB)
info: launch 'vector_square' kernel
build/GCN3_X86/sim/syscall_emul.cc:84: warn: ignoring syscall sched_yield(...)
(further warnings will be suppressed)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)
build/GCN3_X86/sim/syscall_emul.cc:73: warn: ignoring syscall mprotect(...)

On Sep 22 2021, at 5:17 pm, Matt Sinclair  wrote:
> Hi Imad,
>
> I just built the docker earlier this week and did not have any problems 
> (e.g., I ran square and it completed in < 2 hours). How are you trying to 
> build it? And how are you running the applications you mentioned?
>
> Thanks,
> Matt
>
>
>


[gem5-users] gem5 GCN GPU docker error

2021-09-21 Thread Imad Al Assir via gem5-users
Hello,
Is there a problem with the most recent gcn-gpu docker file?
I tried building it several times on Ubuntu 20.04 and 18.04 but it kept giving 
me this error:

[...]
Unpacking rocblas (2.32.0-cc18d25f) ...
dpkg: dependency problems prevent configuration of rocblas:
rocblas depends on rocm-core; however:
Package rocm-core is not installed.

dpkg: error processing package rocblas (--install):
dependency problems - leaving unconfigured
dpkg: dependency problems prevent configuration of rocblas-dev:
rocblas-dev depends on rocblas (>= 2.32.0); however:
Package rocblas is not configured yet.

dpkg: error processing package rocblas-dev (--install):
dependency problems - leaving unconfigured
Errors were encountered while processing:
rocblas
rocblas-dev
+ check_exit_code 1
+ (( 1 != 0 ))
+ exit 1
The command '/bin/sh -c ./install.sh -d -a all -i' returned a non-zero code: 1

I also tried downloading the pre-built docker image (gcr.io/gem5-test/gcn-gpu) 
and built gem5 supposedly with no errors (but with a warning about deprecated 
namespaces not being supported by the compiler). Then when I tried running the 
'square' sample application and other ones from 
gem5-resources/src/gpu/hip-samples (e.g. MatrixTranspose, dynamic_shared, 
inline_asm, etc.), they just kept running indefinitely (> 2 hours), and I had 
to kill them to stop them.
Could you please try building the latest version of the gcn-gpu dockerfile and/or 
running a sample application on the pre-built docker image, and let us know 
whether it works and, if not, how to fix the problem?
Thanks in advance,
Imad Al Assir