[AMD Official Use Only - General]
Hello,
If you want to run CPU+GPU right now you will probably need to use SE mode.
Timing CPUs are not supported in GPUFS configs so the CPU network is completely
ignored as KVM/atomic bypass network. If that doesn’t matter, then you will
need to create a
[AMD Official Use Only - General]
Hi Pau,
Typically, we use docker to compile the binaries. This way you don’t need to
have the libraries installed on the host machine. You can find the official
docker image to build GPU applications here:
that bios instead.
-Matt
From: Pau Galindo Figuerola
Sent: Monday, April 29, 2024 12:21 PM
To: Poremba, Matthew
Cc: The gem5 Users mailing list
Subject: Re: [gem5-users] GCN3 Full System
Caution: This message originated from an External Source. Use proper caution
when opening attachments
rth a try. For example:
export HSA_OVERRIDE_GFX_VERSION="9.0.2"
-Matt
From: Pau Galindo Figuerola
Sent: Friday, April 26, 2024 10:44 AM
To: Poremba, Matthew
Cc: The gem5 Users mailing list
Subject: Re: [gem5-users] GCN3 Full System
[Public]
Hi Pau,
Does the host system have 4 CPUs available for KVM to use? I have seen similar
errors occasionally with even 2 CPUs and simply rerunning the simulation seemed
to fix it. Unfortunately, I am not a KVM expert so I am not entirely sure how
to make it more robust.
-Matt
[Public]
Hi Pau,
From: Pau Galindo Figuerola via gem5-users
Sent: Friday, March 22, 2024 10:47 AM
To: The gem5 Users mailing list
Cc: Pau Galindo Figuerola
Subject: [gem5-users] GPU FS Multiple CPU
[AMD Official Use Only - General]
Hi Pau,
The dgpu_mem_size parameter will only change the memory size for gem5 while the
GPU driver uses an MMIO register value to determine the memory size. The issue
you are seeing is the driver thinks there is still 16GB of memory and it
attempts to write
[Public]
Hi Sandy,
Depending on the benchmark, OpenCL might do an online compile (i.e., compile
the kernels right before running them). If you are using KVM it should just
work. Otherwise, the online compilation will take a significant amount of
simulation time and offline compiling would
ope it works!
Regards,
Pau
On Tue, Dec 19, 2023, 18:57, Poremba, Matthew via gem5-users
<gem5-users@gem5.org> wrote:
[AMD Official Use Only - General]
Hi Sandy,
Could you share the file “m5out/system.pc.com_1.device” as well?
You could also try using vega10_atomic.py instead o
To: The gem5 Users mailing list
Cc: 关富润 <448367...@qq.com>; Poremba, Matthew ; VISHNU
RAMADAS
Subject: Re: [gem5-users] Fail to run gpu-fs
Hi Sandy,
C
[Public]
Hi Pau,
It’s probably not possible without a lot of set up, but I have found that GPUs
that are “not supported” sometimes still work anyway with certain compute
stacks. You would probably need to:
* Find a BIOS for a GCN3 GPU or rip it from a real GPU – This is so the
driver
version or full system mode is another option
to use Vega ISA.
For the docker automatically quitting, you will have to do `docker run -it …`
to start an interactive session.
-Matt
From: Anoop Mysore
Sent: Monday, September 11, 2023 10:33 AM
To: Poremba, Matthew
Cc: Matt Sinclair ; The gem5 Users
[Public]
This is not the first time I am hearing about this issue. It seems stable
needs to be hotfixed for GPU.
For now, you can try the develop branch instead. It is tested quite well so it
is relatively stable anyway despite the name.
-Matt
From: Pau Galindo Figuerola via gem5-users
.
-Matt
From: Anoop Mysore
Sent: Friday, September 8, 2023 7:33 AM
To: Matt Sinclair
Cc: The gem5 Users mailing list ; Poremba, Matthew
Subject: Re: [gem5-users] Re: Error in an application running on gem5 GCN3
(with apu_se.py)
[AMD Official Use Only - General]
Hi,
Can you show the output you removed? What is being printed right before the
crash?
Thanks,
Matt
From: Matt Sinclair
Sent: Sunday, July 23, 2023 10:37 AM
To: The gem5 Users mailing list
Cc: l...@163.com <17861509...@163.com>; Poremba, Matthew
S
[Public]
Hi,
No worries about the questions! I will try to answer them all, so this will be
a long email:
The disconnected (or disjoint) Ruby network is essentially the same as the APU
Ruby network used in SE mode - That is, it combines two Ruby protocols in one
protocol (MOESI_AMD_base
:40 AM
To: The gem5 Users mailing list
Cc: Anoop Mysore ; Poremba, Matthew
Subject: Re: [gem5-users] Re: GPU-FS simulation progress
Maybe I'm missing something
[AMD Official Use Only - General]
Hi,
SE mode supports ROCm 4.0 only. What version of ROCm is installed on the host
machine? If they are different the IOCTLs might be as well.
You can alternately try full system with GPU if you want to avoid docker.
-Matt
From: Anoop Mysore via
to FSConfig.py
on the most recent develop branch of gem5 and was seeing a /dev/pmem0 device.
I didn’t touch the e820 table though. Maybe you could try with the latest gem5
resources and see if it works.
-Matt
From: Vincent Abraham
Sent: Monday, May 22, 2023 1:14 PM
To: Poremba, Matthew
Cc: The gem5
[AMD Official Use Only - General]
Hi,
Are you building pmem as a module as described in the blog? (“ PMEM:
Persistent memory block device support”) If so, I would try building it into
the kernel directly. It is possibly looking for the module for your compiled
kernel and does not find it
[AMD Official Use Only - General]
Hello,
I don't know how to change the frequency or if it will work, but in your gem5
python script you can call "m5.simulate(10 * 1e12)" to simulate 10 seconds
worth of ticks at a time and change the frequency after each call to that.
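A minimal sketch of that loop, assuming gem5's default tick rate of 1e12 ticks per second (1 ps per tick). The names run_in_chunks and set_frequency are hypothetical: inside a gem5 config script you would pass m5.simulate as the simulate argument, and set_frequency stands in for whatever mechanism, if any, ends up changing the clock.

```python
TICKS_PER_SECOND = 10**12  # assumption: gem5's default 1 ps tick


def seconds_to_ticks(seconds):
    """Convert simulated seconds to ticks at the assumed tick rate."""
    return int(seconds * TICKS_PER_SECOND)


def run_in_chunks(simulate, set_frequency, frequencies, chunk_seconds=10):
    """Simulate one chunk, then apply the next frequency, and repeat.

    simulate:      callable taking a tick count (m5.simulate in gem5)
    set_frequency: hypothetical callable that applies a new frequency
    frequencies:   one entry per chunk, applied after that chunk ends
    """
    results = []
    for freq in frequencies:
        # Run for chunk_seconds of simulated time, as in m5.simulate(10 * 1e12).
        results.append(simulate(seconds_to_ticks(chunk_seconds)))
        # Change the frequency after each call returns.
        set_frequency(freq)
    return results
```

Whether a frequency change made this way actually takes effect is untested here, as noted above.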
-Matt
From: Mejbaul
[AMD Official Use Only - General]
Hi,
1. Full system mode in gem5 has "two" outputs. There is the simulator output
(what you are showing in the email) and the terminal output. In SE mode these
are combined into one output. What is being shown is the simulator output.
This is really
[AMD Official Use Only - General]
Hi,
GPU_RfO and GPU_VIPER_Region were deprecated, mostly because there is no one to
help maintain all of the GPU protocols, so we opted to focus on just one. I
don't think there have been any Ruby/SLICC changes that would have broken the
ability to build
for full system GPU.
-Matt
From: Rajesh Shashi Kumar
Sent: Wednesday, December 7, 2022 11:32 AM
To: Poremba, Matthew
Cc: The gem5 Users mailing list
Subject: Re: [gem5-users] GPU-FS simulation progress
on my
local setup so it would be difficult for me to debug why it does not work for
other folks.
-Matt
From: Rajesh Shashi Kumar
Sent: Tuesday, December 6, 2022 5:49 PM
To: Poremba, Matthew
Cc: The gem5 Users mailing list
Subject: Re: [gem5-users] GPU-FS simulation progress
to try, but I would like to see if any other scripts are
working first and understand your setup to see if other folks might run into
the same issue in the future.
-Matt
From: Rajesh Shashi Kumar
Sent: Tuesday, December 6, 2022 4:09 PM
To: Poremba, Matthew
Cc: The gem5 Users mailing list
[AMD Official Use Only - General]
Hi Rajesh,
It looks like no progress has been made since a very early tick number (the
timestamp printed by Linux is equal to the current simulation tick / 1 trillion).
For reference it should take no more than 1-3 wall clock minutes to fully boot
Linux and
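The tick-to-timestamp relation above can be sketched as a small helper for checking how far a stalled boot actually got. It assumes the default 1e12 ticks per second; the function name is made up for illustration, and the bracketed format only imitates the usual kernel log prefix.

```python
TICKS_PER_SECOND = 10**12  # assumption: 1 trillion ticks per simulated second


def tick_to_kernel_timestamp(tick):
    """Return the Linux-style timestamp that corresponds to a gem5 tick."""
    # Work in whole microseconds to avoid floating-point rounding.
    usec_total = tick * 10**6 // TICKS_PER_SECOND
    secs, usecs = divmod(usec_total, 10**6)
    return f"[{secs:5d}.{usecs:06d}]"
```

For example, a simulation sitting at tick 2500000000000 corresponds to a kernel timestamp of about 2.5 seconds.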
[AMD Official Use Only - General]
Hi Rajesh,
Thanks for the update. I'm glad you were able to get it worked out. Ideally
we wouldn't *require* sudo access but for KVM in general I think it is going to
highly depend on how the system was set up and there are some things that packer
won't be
[AMD Official Use Only - General]
Hi,
The rocclr, panic, and unimplemented instructions errors/warnings seem to be
caused by this patch:
https://gem5-review.googlesource.com/c/public/gem5/+/64831. It is likely the
ROCm stack is taking a different code path with the different processor
[AMD Official Use Only]
These would be valid for both as they both use the same cache protocol files.
I'm not very familiar with how dGPU is hacked up in SE mode to look like a
dGPU...
-Matt
From: David Fong
Sent: Thursday, March 17, 2022 9:57 AM
To: Poremba, Matthew ; Matt Sinclair
; Poremba, Matthew ;
Bharadwaj, Srikant
Subject: RE: gem5 : X86 + GCN3 (gfx801) + test_fwd_lrn
[CAUTION: External Email]
Matt P or Srikant: can you please help David with the latency question? You
know the answers better than I do here.
Matt
From: David Fong <da...@chronostech.com>
[Public]
Hi David,
You are hitting the limit on the number of same MachineTypes in a Ruby network.
You can change this by modifying the `build_opts/GCN3_X86` file and adding a
new line with `NUMBER_BITS_PER_SET = '128'`, or higher, and then recompile
gem5. As far as I know there is not a
[AMD Official Use Only]
Hi David,
I generally look at the shader_active_ticks stat for very high level
performance comparisons.
-Matt
From: David Fong
Sent: Friday, March 4, 2022 10:27 AM
To: Poremba, Matthew ; gem5 users mailing list
; Bobby Bruce ; Matt Sinclair
; Kyle Roarty
Subject
benchmarks/test_fwd_softmax
-cdnnmark_test_fwd_softmax --options="-config
gem5-resources/src/gpu/DNNMark/config_example/softmax_config.dnnmark -mmap
gem5-resources/src/gpu/DNNMark/mmap.bin" --gfx-version=gfx900 --dgpu
-Matt
From: David Fong
Sent: Friday, March 4, 2022 10:01 AM
To: Poremba,
[AMD Official Use Only]
Hi,
I don't know if this is what is causing this specific forking problem, but
gfx900 is VEGA not GCN3. There is a separate build for VEGA. If you want GCN3
dGPU you want gfx803.
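As a quick reference, the pairing of gfx versions and gem5 builds mentioned in these threads could be written down as a lookup. This is only a sketch: the table covers just the versions named here (gfx801 is taken from a thread subject line), GCN3_X86 is assumed to be the matching GCN3 build target, and the helper name is hypothetical.

```python
# gfx version -> (ISA, gem5 build target); only versions named in these threads.
GFX_TARGETS = {
    "gfx801": ("GCN3", "GCN3_X86"),  # assumption: from the thread subject line
    "gfx803": ("GCN3", "GCN3_X86"),  # GCN3 dGPU
    "gfx900": ("VEGA", "VEGA_X86"),  # VEGA, needs the separate VEGA build
}


def build_target_for(gfx_version):
    """Return the gem5 build target for a gfx version, or None if unknown."""
    entry = GFX_TARGETS.get(gfx_version)
    return entry[1] if entry else None
```

Checking the --gfx-version argument against the build this way catches the gfx900-on-GCN3 mismatch before launching a run.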
-Matt
From: David Fong via gem5-users
Sent: Friday, March 4, 2022 9:34 AM
To: Bobby
[AMD Official Use Only]
Hi Imad,
Yes, you should be able to run DGPU in SE mode with gfx803 on the stable
branch. On develop, gfx900 is also a dgpu option if you build VEGA_X86.
-Matt
From: Imad Al Assir
Sent: Friday, October 1, 2021 4:47 PM
To: Poremba, Matthew ; gem5 users mailing list
[AMD Official Use Only]
Hi Imad,
It is still not supported on stable nor develop. We are submitting patches for
this over time as a way to not overwhelm the (volunteer) reviewers. It will
most likely be supported in the next gem5 release.
-Matt
From: Imad Al Assir via gem5-users
Sent:
[AMD Official Use Only]
Hi Imad,
Yes, the docker seems to have broken in the past few days.
Regarding the benchmark not completing, please change your command to use 3
CPUs:
docker run --rm -v $PWD/gem5:/gem5 -v $PWD/gem5-resources:/gem5-resources \
-w /gem5
[AMD Official Use Only]
Hi Imad & Matt,
I am seeing the same error as of this morning. Not sure quite what the issue
is, but I suspect not everything is choosing a specific package version and
something was updated in an apt repo or one of the repos rocBLAS's install.sh
script pulls from.
[AMD Public Use]
Hi,
Develop branch has the latest Dockerfile. Note that GCN3 won't be "officially"
part of gem5 until 21.0 release (in a few weeks).
-Matt
-Original Message-
From: xpf via gem5-users
Sent: Monday, March 8, 2021 11:21 PM
To: gem5-users@gem5.org
Cc:
in gem5. Different compiler maybe? The change
that broke this for me literally just moves files around so I have no idea how
that caused it to break.
-Matt
From: Matt Sinclair
Sent: Friday, November 6, 2020 2:55 PM
To: Daniel Gerzhoy
Cc: Kyle Roarty ; Poremba, Matthew ;
Yichen Yang ; gem5 users
to this build.
-Matt
From: Yichen Yang
Sent: Friday, November 6, 2020 1:30 PM
To: Poremba, Matthew
Cc: gem5 users mailing list
Subject: Re: [gem5-users] gem5 GCN3 GPU model running issues
Thanks!
I tried the develop branch. But running into new problems
warn: ignoring
[AMD Public Use]
Hi Yichen,
Based on the changes I see you've made, it seems like you are using an older
version of gem5. These should all be fixed, including the error you are
seeing, on the tip of develop.
Keep in mind GCN3 was not officially part of the gem5 20.1 release, so the most
up
[AMD Official Use Only - Internal Distribution Only]
You could probably use the simpoint_start_insts vector param in CPU to have
simulation exit to your python script and dump/reset stats there. For example:
cpu.simpoint_start_insts = [x*N for x in range(1000)]
where N is the instruction
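To make the list comprehension above concrete, here is a small sketch that builds the same kind of vector. The helper name and its arguments are hypothetical; interval plays the role of N, assumed to be an instruction count between exits.

```python
def periodic_start_insts(interval, count):
    """Build evenly spaced instruction counts: 0, N, 2N, ... (count entries)."""
    return [x * interval for x in range(count)]


# Example: four exit points spaced 1000 instructions apart.
points = periodic_start_insts(1000, 4)
print(points)
```

Note that the first entry is 0, so depending on how the CPU treats it you may prefer to start the range at 1.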
Hi Reza,
Memory controllers (including NVMain) are behind a slave port and are generally
not aware of what the cache is doing – In other words, you want to look in the
gem5 source if you want to do something special for LLC misses. This will be
different for Ruby and classic caches:
If you