[gem5-users] Re: GPU-FS simulation progress
Oh I understand now, you are both right. The confusion (with me) was that I'd seen a "PASSED" printed on stdout (in SE) -- indicating expected (successful) termination of the application I was running. I didn't realise the prints (in FS) on stdout did not also include application-prints. I do in fact see PASSED in "m5out/system.pc.com_1.device" along with my redirects and suchlike.

Thank you so much for all the help! The FS mode works in all regards.
[gem5-users] Re: GPU-FS simulation progress
For some reason I cannot see the original email Matt is replying to, but m5_exit is the normal exit status.

Just as a reminder, the gem5 output does not give any indication about whether or not your application running *in* gem5 completed successfully. You will need to check the terminal output (i.e., the output of the system being simulated). By default this is “m5out/system.pc.com_1.device”. The terminal output in full system (FS) mode is not concatenated with gem5 output as it is in system emulation (SE) mode.

-Matt
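As a rough illustration of the point above (check the simulated system's terminal log, not gem5's own output), here is a minimal Python sketch. The log path is the default named in the message; the function name `app_passed` and the "PASSED" marker are assumptions for illustration, since the success string depends entirely on the application being run.

```python
from pathlib import Path


def app_passed(terminal_log: Path, marker: str = "PASSED") -> bool:
    """Return True if the simulated system's terminal log contains the marker.

    The marker is application-specific; "PASSED" is only an assumed example.
    """
    if not terminal_log.is_file():
        return False
    # A serial log may contain bytes that are not valid UTF-8, so decode leniently.
    text = terminal_log.read_text(encoding="utf-8", errors="replace")
    return any(marker in line for line in text.splitlines())


if __name__ == "__main__":
    # Default FS-mode terminal log location mentioned in the thread.
    log = Path("m5out/system.pc.com_1.device")
    status = "marker found" if app_passed(log) else "marker not found"
    print(f"{log}: {status}")
```

If the application prints a different success string, pass it as `marker`; a missing log file is treated as "not passed" rather than raising.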
[gem5-users] Re: GPU-FS simulation progress
Maybe I'm missing something, but where in that set of prints is the error? At the end I see this:

Exiting @ tick 2581705103 because m5_exit instruction encountered

Which is the normal thing to see when gem5 exits.

Matt

On Fri, Jun 23, 2023 at 4:06 AM Anoop Mysore via gem5-users wrote:

> Reviving a previous thread:
> https://www.mail-archive.com/gem5-users@gem5.org/msg21015.html
>
> I am facing the same exact error, almost the same processor -- AMD Ryzen 5800 HS (laptop).
> However, for the OP, moving to a faster EPYC worked. I was able to move only to a (desktop) Intel i7-8700 CPU @ 3.20GHz. Here's a couple lines of progress, but it seems to fail too.
> I am on ROCm v4.0.1 -- compiled square test with that. Tried both locally built disk-image, and a downloaded image, and downloaded kernel.
>
> Here's the concise gem5 log (without debug flags -- that's attached as a file):
>
> ___
> gem5 Simulator System. https://www.gem5.org
> gem5 is copyrighted software; use the --copyright option for details.
>
> gem5 version 22.1.0.0
> gem5 compiled Jun 22 2023 18:46:21
> gem5 started Jun 23 2023 10:24:44
> gem5 executing on ashkan-asgharzadeh, pid 24266
> command line: gem5/build/VEGA_X86/gem5.opt gem5/configs/example/gpufs/vega10_kvm.py --disk-image gem5-resources/src/gpu-fs/disk-image/rocm42/rocm42-image/rocm42 --kernel gem5-resources/src/gpu-fs/vmlinux-5.4.0-105-generic --gpu-mmio-trace gem5-resources/src/gpu-fs/mmio_trace.log --app gem5-resources/src/gpu/square/bin/square
>
> warn: Memory mode will be changed to atomic_noncaching
> warn: The `get_runtime_isa` function is deprecated. Please migrate away from using this function.
> Global frequency set at 1 ticks per second
> build/VEGA_X86/mem/dram_interface.cc:692: warn: DRAM device capacity (8192 Mbytes) does not match the address range assigned (4096 Mbytes)
> build/VEGA_X86/sim/kernel_workload.cc:46: info: kernel located at: gem5-resources/src/gpu-fs/vmlinux-5.4.0-105-generic
> build/VEGA_X86/base/stats/storage.hh:282: warn: Bucket size (5) does not divide range [1:75] into equal-sized buckets. Rounding up.
> ...
> ...
> build/VEGA_X86/mem/dram_interface.cc:692: warn: DRAM device capacity (128 Mbytes) does not match the address range assigned (16384 Mbytes)
> build/VEGA_X86/base/statistics.hh:280: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
> 0: system.pc.south_bridge.cmos.rtc: Real-time clock set to Sun Jan 1 00:00:00 2012
> system.pc.com_1.device: Listening for connections on port 3456
> build/VEGA_X86/base/statistics.hh:280: warn: One of the stats is a legacy stat. Legacy stat is a stat that does not belong to any statistics::Group. Legacy stat is deprecated.
> 0: system.remote_gdb: listening for remote gdb on port 7000
> build/VEGA_X86/dev/intel_8254_timer.cc:128: warn: Reading current count from inactive timer.
> tcmalloc: large alloc 2147483648 bytes == 0x562c3a2f6000 @ 0x7f421c1cd887 0x562c18bc2019 0x562c18106aa6 0x562c17dd474f 0x7f421ca5758a 0x7f421c9bfec8 0x7f421c9c6303 0x7f421c9be803 0x7f421c9c02aa 0x7f421c9c6303 0x7f421c9be803 0x7f421c9c02be 0x7f421c9c6303 0x7f421c9bfa0f 0x7f421c9c04ce 0x7f421c9c124b 0x7f421c9cc55d 0x7f421ca5753b 0x7f421c9c01ec 0x7f421c9c6303 0x7f421c9bfa0f 0x7f421c9c04ce 0x7f421ca7fd6b 0x7f421caab768 0x562c17e5aca9 0x562c17d4c7e0 0x7f421a437c87 0x562c17dc531a
> Running the simulation
> build/VEGA_X86/cpu/kvm/base.cc:150: info: KVM: Coalesced MMIO disabled by config.
> build/VEGA_X86/arch/x86/cpuid.cc:181: warn: x86 cpuid family 0x: unimplemented function 2
> build/VEGA_X86/sim/simulate.cc:192: info: Entering event queue @ 0. Starting simulation...
> ...
> build/VEGA_X86/arch/x86/kvm/x86_cpu.cc:1563: warn: kvm-x86: MSR (0x4b564d05) unsupported by gem5. Skipping.
> build/VEGA_X86/dev/x86/pc.cc:117: warn: Don't know what interrupt to clear for console.
> build/VEGA_X86/dev/amdgpu/amdgpu_vm.hh:240: warn: Accessing unsupported MMIO aperture! Assuming NBIO
> Exiting @ tick 2581705103 because m5_exit instruction encountered
> build/VEGA_X86/cpu/kvm/base.cc:572: hack: Pretending totalOps is equivalent to totalInsts()

___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
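To make the "normal exit" line above easier to spot in a long gem5 log, here is a small Python sketch that extracts the tick and the stated cause from an `Exiting @ tick ... because ...` line. The helper name `find_exit` and the regular expression are my own; only the log line itself comes from the thread. Note that this only reports why gem5 stopped, not whether the simulated application succeeded.

```python
import re
from typing import Optional, Tuple

# Pattern for the line gem5 prints when the simulation loop ends.
EXIT_RE = re.compile(r"Exiting @ tick (\d+) because (.+)")


def find_exit(gem5_output: str) -> Optional[Tuple[int, str]]:
    """Return (tick, cause) from the last 'Exiting @ tick' line, or None."""
    result = None
    for line in gem5_output.splitlines():
        # Use search rather than match so quoted lines ("> Exiting ...") also work.
        match = EXIT_RE.search(line)
        if match:
            result = (int(match.group(1)), match.group(2).strip())
    return result


if __name__ == "__main__":
    sample = "Exiting @ tick 2581705103 because m5_exit instruction encountered"
    print(find_exit(sample))  # (2581705103, 'm5_exit instruction encountered')
```

A cause of "m5_exit instruction encountered" means the guest asked gem5 to stop, which is the expected way for these runs to end.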
[gem5-users] Re: GPU-FS simulation progress
I see, thanks again. The verification should have passed in that case...

The docker image is *only* for building applications, so that the user does not need to install ROCm locally to build applications. You do *not* run gem5 simulations in a docker for full system GPU.

-Matt
[gem5-users] Re: GPU-FS simulation progress
By older ROCm I was referring to this v4.2 which is mentioned in the gem5-resources documentation. I used the following command to build square. Not sure if this makes a difference, but do you run the simulation inside the docker image or on the host machine?

docker run --rm -v ${PWD}:${PWD} -w ${PWD} gcr.io/gem5-test/gpu-fs:latest bash -c 'make clean; HCC_AMDGPU_TARGET=gfx900 make'
[gem5-users] Re: GPU-FS simulation progress
Thanks Rajesh,

That is good to know. I don't think there is a list anywhere of which CPUs work with KVM.

Which older ROCm do you mean here? Was square compiled with an older version? Ideally the verification should be passing as well. At least, it does on my local setup so it would be difficult for me to debug why it does not work for other folks.

-Matt
[gem5-users] Re: GPU-FS simulation progress
Okay, it turns out the issue was indeed, somehow, the slow local machine (AMD Ryzen 7 5800H). I ran the same thing on an "AMD EPYC 7451 24-Core Processor x2" and I am now able to run square within 10 minutes or so.

I guess the last two lines could be because of using an older ROCm version.

Running ../../gpu/square/bin/square
info: running on device Vega 10 XTX [Radeon Vega Frontier Edition]
info: architecture on AMD GPU device is: 900
info: allocate host and device mem ( 7.63 MB)
info: launch 'vector_square' kernel
info: check result
*error: 'hipErrorUnknown'(999) at square.cpp:82
./script.sh: line 13: 548 Segmentation fault ./myapp*

Thank you again for your time on this!

--
Rajesh Shashi Kumar
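The square run in this thread got as far as the result check and then printed 'hipErrorUnknown' followed by a segmentation fault. A minimal sketch of picking such lines out of saved guest terminal output is below. The helper name `find_failures` and the marker list are assumptions made for illustration (taken from the log in this thread); they are not part of gem5 or ROCm.

```python
# Hypothetical helper: scan saved guest terminal output for failure lines.
# The marker strings are assumptions based on the log shown in this thread.
FAILURE_MARKERS = ("error:", "Segmentation fault")


def find_failures(terminal_text, markers=FAILURE_MARKERS):
    """Return the lines of terminal_text that contain any failure marker."""
    return [
        line.strip()
        for line in terminal_text.splitlines()
        if any(marker in line for marker in markers)
    ]


# Sample text copied from the application output earlier in the thread.
sample = """\
info: launch 'vector_square' kernel
info: check result
error: 'hipErrorUnknown'(999) at square.cpp:82
./script.sh: line 13: 548 Segmentation fault ./myapp
"""

for line in find_failures(sample):
    print(line)
```

An empty result only means none of the listed markers appeared; it does not by itself show that the application passed.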
[gem5-users] Re: GPU-FS simulation progress
Thank you for your time. I tried the provided example for booting Ubuntu from a disk image:

*./build/X86/gem5.opt configs/example/gem5_library/x86-ubuntu-run.py*

With this example the boot did complete, and *kvm-ok* returns as expected on my machine. Also, I should mention that I'm using the pre-compiled image/kernel for GPU-FS to rule out any uncertainty there.

Term output:
Welcome to Ubuntu 18.04.2 LTS!

systemd[1]: Set hostname to .
systemd[1]: File /lib/systemd/system/systemd-journald.service:36 configures an IP firewall (IPAddressDeny=any), but the local system does not support BPF/cgroup based firewalling.
systemd[1]: Proceeding WITHOUT firewalling in effect! (This warning is only shown for the first loaded unit using IP firewalling.)
random: systemd: uninitialized urandom read (16 bytes read)
systemd[1]: Reached target Remote File Systems.
[ OK ] Reached target Remote File Systems.
random: systemd: uninitialized urandom read (16 bytes read)
systemd[1]: Created slice System Slice.
[ OK ] Created slice System Slice.
...

___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org
[gem5-users] Re: GPU-FS simulation progress
[AMD Official Use Only - General]

At this point I would check whether the other KVM scripts are working for you (there are some simple tests somewhere, like booting Ubuntu and exiting). KVM works better on some CPUs than on others, I believe, or at least this was true in the past. I have a few other ideas to try, but I would first like to see whether any other scripts work and to understand your setup, in case other folks run into the same issue in the future.

-Matt
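Before trying the other KVM scripts suggested in this thread, one quick sanity check is whether the host exposes the KVM device node and whether the current user may open it. The sketch below assumes a Linux host where that node is /dev/kvm; the function name `kvm_device_status` is hypothetical. Passing this check says nothing about how well KVM performs on a particular CPU.

```python
import os


def kvm_device_status(path="/dev/kvm"):
    """Report whether the KVM device node exists and is usable by this user."""
    if not os.path.exists(path):
        return "missing"
    # gem5's KVM CPU needs to open the device for reading and writing.
    if not os.access(path, os.R_OK | os.W_OK):
        return "no-permission"
    return "ok"


print(kvm_device_status())
```

A "no-permission" result usually points at group membership on the host rather than at gem5 itself.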
[gem5-users] Re: GPU-FS simulation progress
Thank you for your response.

I double-checked my image and kernel. I don't think KVM is hanging, but the progress seems to be a single character printed on the term every once in a while. I assume this is even before it could finish booting. Not sure if fast-forwarding could help here.

My term output:
m5 terminal: Terminal 0
[0.00] Linux version

Thanks,
Rajesh
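One way to judge whether the guest is advancing at all is to read the last kernel timestamp from the terminal output and compare it between two points in wall-clock time. The sketch below assumes the usual Linux kernel log prefix of a bracketed number of seconds at the start of a line; the helper name `last_kernel_timestamp` and the sample lines are made up for illustration.

```python
import re

# Kernel log lines start with a bracketed timestamp in seconds,
# for example "[    0.004000]". Lines without that prefix are ignored.
_TIMESTAMP = re.compile(r"^\[\s*(\d+\.\d+)\]")


def last_kernel_timestamp(terminal_text):
    """Return the last kernel timestamp (in seconds) seen, or None if absent."""
    last = None
    for line in terminal_text.splitlines():
        match = _TIMESTAMP.match(line)
        if match:
            last = float(match.group(1))
    return last


# Made-up sample; real terminal output will differ.
sample = """\
m5 terminal: Terminal 0
[    0.000000] first made-up kernel line
[    0.004000] second made-up kernel line
"""

print(last_kernel_timestamp(sample))
```

If the value stays the same across several minutes of wall-clock time, the guest has effectively stalled.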
[gem5-users] Re: GPU-FS simulation progress
[AMD Official Use Only - General]

Hi Rajesh,

It looks like no progress has been made since a very early tick number (the timestamp printed by Linux is equal to the current simulation tick / 1 trillion). For reference, it should take no more than 1-3 wall-clock minutes to fully boot Linux and begin running the application with the KVM CPU. I have seen, fairly rarely, cases where KVM simply hangs and makes no progress, but simply running again fixed this. Your command looks correct, though.

Maybe someone who knows more about debugging KVM can comment on how to see what the KVM CPU is doing.

-Matt

From: Rajesh Shashi Kumar via gem5-users
Sent: Tuesday, December 6, 2022 2:06 PM
To: gem5 users mailing list
Cc: Rajesh Shashi Kumar
Subject: [gem5-users] GPU-FS simulation progress

Hi,

I followed the instructions on running gpu-fs square using the gem5-resources repository. My simulation has been stuck here for a while:

...
build/VEGA_X86/arch/x86/kvm/x86_cpu.cc:1561: warn: kvm-x86: MSR (0xc0010015) unsupported by gem5. Skipping.
build/VEGA_X86/arch/x86/kvm/x86_cpu.cc:1561: warn: kvm-x86: MSR (0x4b564d05) unsupported by gem5. Skipping.
build/VEGA_X86/dev/x86/pc.cc:117: warn: Don't know what interrupt to clear for console.
169640: system.pc.com_1.device: attach terminal 0

I tried attaching a terminal in a different tab using the following, but I'm not sure if my image has booted or if the application is running:

$ util/term/m5term localhost 3456
m5 terminal: Terminal 0
[0.0

Any advice is appreciated!

My run command:
build/VEGA_X86/gem5.opt configs/example/gpufs/vega10_kvm.py --disk-image ../disk-image/rocm42/rocm42-image/rocm42 --kernel ../vmlinux-5.4.0-105-generic --gpu-mmio-trace ../vega_mmio.log --app ../../gpu/square/bin/square

Thanks,
Rajesh
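The relationship between simulation ticks and the Linux timestamp mentioned in this message can be written down directly. The sketch below assumes gem5's default resolution of 10**12 ticks per simulated second (the "1 trillion" above); the function names are my own, and the constant would need changing if a configuration uses a different tick frequency.

```python
# Assumed default: 10**12 ticks per simulated second (1 tick = 1 picosecond).
TICKS_PER_SECOND = 10**12


def ticks_to_seconds(ticks, ticks_per_second=TICKS_PER_SECOND):
    """Convert a gem5 tick count to simulated seconds."""
    return ticks / ticks_per_second


def seconds_to_ticks(seconds, ticks_per_second=TICKS_PER_SECOND):
    """Convert simulated seconds (e.g. a kernel timestamp) to gem5 ticks."""
    return round(seconds * ticks_per_second)


# Tick at which the terminal was attached in the log from this thread.
print(ticks_to_seconds(169640))
```

Under that assumption, tick 169640 corresponds to well under a microsecond of simulated time, which is consistent with the kernel timestamp still reading zero.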