[gem5-users] Re: Gem5 GCN3 DNNMark benchmark error (fwd_softmax is ok, but others are not)

2022-02-12 Thread Matt Sinclair via gem5-users
Thanks, this is helpful.  Kyle and I went through the error, but we haven't
run on a machine with enough memory to run batch size 100 (which is what
bwd_activation assumes by default).  However, we have gotten it to run with
batch sizes up to 50.
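
As a rough sanity check on the memory requirement: the gem5 log quoted
further down in this thread reports "Memory Usage: 46436124 KBytes" at the
point where the run aborted on the 256G machine.  The sketch below converts
that figure, assuming "KBytes" means units of 1024 bytes (an assumption; with
1000-byte units the conclusion is the same).  The helper name kib_to_gib is
only illustrative.

```python
KIB_PER_GIB = 1024 * 1024  # 1 GiB = 1024 * 1024 KiB


def kib_to_gib(kib: int) -> float:
    """Convert a size in KiB to GiB."""
    return kib / KIB_PER_GIB


# "Memory Usage" value from the gem5 log quoted in the original post.
reported_kib = 46436124
# Memory size of the first machine described in the original post.
host_ram_gib = 32

usage_gib = kib_to_gib(reported_kib)
print(f"gem5 memory usage at abort: {usage_gib:.1f} GiB")
print(f"exceeds the {host_ram_gib}G machine: {usage_gib > host_ram_gib}")
```

This comes out to roughly 44 GiB, which is already more than the 32G machine
has, and it is only the usage at the time of the abort, not necessarily the
peak.  That is consistent with the process being killed on the smaller
machine.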

We think the failure you were seeing was essentially happening because we
weren't testing bwd_activation in the nightly/weekly regressions, and thus
missed that the file we use to generate the MIOpen cachefiles for the
DNNMark kernels did not have the appropriate kernel for bwd_activation.
Kyle created a patch to fix this problem:
https://gem5-review.googlesource.com/c/public/gem5-resources/+/56789.

You will need to pull this patch and rerun generate_cachefiles before
trying to run again.  Moreover, since we only know it works up to batch
size 50, you may want to change the batch size here:
https://gem5.googlesource.com/public/gem5-resources/+/refs/heads/stable/src/gpu/DNNMark/config_example/activation_config.dnnmark#6,
to something <= 50 (N in that file is the batch size).  Alternatively, if you
need a batch size > 50, you can try running again on the larger machine you
mentioned before, but since we haven't run it on such a large machine yet,
we don't know exactly what will happen.
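
If you would rather script the batch size change than edit the file by hand,
something like the following could work.  This is only a sketch: it assumes
the config stores the batch size on a line of its own in the form "n=<integer>",
and the function name cap_batch_size and the sample text are made up for
illustration, not taken from DNNMark.

```python
import re


def cap_batch_size(config_text: str, max_n: int = 50) -> str:
    """Lower any `n=<int>` line above max_n to max_n; leave the rest alone."""

    def _cap(match):
        value = int(match.group(2))
        return f"{match.group(1)}{min(value, max_n)}"

    # Only match lines that consist of `n=<digits>` (spaces/tabs allowed),
    # so keys such as `name=...` are not touched.
    pattern = r"(?m)^([ \t]*n[ \t]*=[ \t]*)(\d+)[ \t]*$"
    return re.sub(pattern, _cap, config_text)


# Hypothetical sample text, NOT the real contents of activation_config.dnnmark.
sample = "[Activation]\nname=relu1\nn=100\nc=32\n"
print(cap_batch_size(sample))
```

To apply it to the real file, read the file's text, pass it through
cap_batch_size, and write the result back, after checking that the file
really uses the "n=" form assumed here.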

Hope this helps,
Matt

On Fri, Feb 11, 2022 at 12:11 PM 1575883782 via gem5-users <
gem5-users@gem5.org> wrote:

> Yes, I am running DNNMark inside docker, and the version is v21-2. I run
> the commands through the remote-container plugin of VS Code.
>
> ---Original---
> *From:* "Matt Sinclair via gem5-users"
> *Date:* Sat, Feb 12, 2022 01:41 AM
> *To:* "gem5 users mailing list";
> *Cc:* "1575883782"<1575883...@qq.com>;"Kyle Roarty";"Matt
> Sinclair";
> *Subject:* [gem5-users] Re: Gem5 GCN3 DNNMark benchmark error
> (fwd_softmax is ok, but others are not)
>
> One more question for you, original poster: are you running DNNMark inside
> the docker resources we provided:
> http://resources.gem5.org/resources/dnn-mark?
>
> Or are you trying to get this running on your machine directly?
>
> Matt
>
> On Fri, Feb 11, 2022 at 11:37 AM Matt Sinclair <
> mattdsinclair.w...@gmail.com> wrote:
>
>> Kyle, can you please help with this?  I don't recall when we last tested
>> bwd_act.
>>
>> Matt
>>
>> On Fri, Feb 11, 2022 at 2:18 AM 1575883782 via gem5-users <
>> gem5-users@gem5.org> wrote:
>>
>>> Hi,
>>>
>>> I was trying to run the DNNMark benchmark with the GCN3 GPU model, following
>>> the instructions on http://resources.gem5.org/resources/dnn-mark.
>>>
>>> I succeeded in running fwd_softmax, but I ran into problems when running
>>> other layers, for example "bwd_activation".
>>>
>>>
>>> I tried to run the gem5 DNNMark bwd_activation benchmark on 2 computers.
>>>
>>>
>>> The first computer has 32G of memory. gem5 could run fwd_softmax
>>> successfully, but was always killed while running bwd_activation. The only
>>> error message was "Killed" plus the process id. I guess this computer does
>>> not have enough memory to run it.
>>>
>>>
>>> The second computer has 256G of memory. gem5 could run fwd_softmax
>>> successfully, but some problems happened while running bwd_activation. I
>>> solved some of them, but not all. The error messages are:
>>>
>>>
>>> > I0909 01:46:50.680040   100 dnn_wrapper.h:341] enter dnnmarkActivationBackward func
>>> > build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
>>> > build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
>>> > build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
>>> > build/GCN3_X86/sim/mem_pool.cc:110: warn: Reached m5ops MMIO region
>>> > build/GCN3_X86/arch/x86/faults.cc:170: panic: Tried to read unmapped address 0.
>>> > PC: 0x7fffeef84b80, Instr:   FMUL2_M : ldfp87   %ufp1, DS:[rdx]
>>> > Memory Usage: 46436124 KBytes
>>> > Program aborted at tick 10680071080500
>>> >
>>>
>>>
>>> Sometimes the errors are:
>>>
>>> > panic: Tried to write unmapped address 0x2b95d881.
>>>
>>> or
>>>
>>> > panic: Tried to write unmapped address 0x3.
>>>
>>>
>>> According to my log, the problem happens in the
>>> "dnnmarkActivationBackward" function.
>>>
>>> >   LOG(INFO) << "enter dnnmarkActivationBackward func";
>>> > #ifdef AMD_MIOPEN
>>> >   MIOPEN_CALL(miopenActivationBackward(
>>> >       mode == COMPOSED ?
>>> >       handle.GetMIOpen(idx) : handle.GetMIOpen(),
>>> >       activation_desc.Get(),
>>> >       alpha,
>>> >       top_desc.Get(), y,
>>> >       top_desc.Get(), dy,
>>> >       bottom_desc.Get(), x,
>>> >       beta,
>>> >       bottom_desc.Get(), dx));
>>> > #endif
>>> >   LOG(INFO) << "exit dnnmarkActivationBackward func";
>>>
>>>
>>> It seems to be a MIOpen interface function. I don't know how to solve this.
>>> Could someone help me?
>>>
>>>
>>> PS:
>>>
>>> My gem5 version is v21-2, and the docker image is v21-2.
>>>
>>> my run command is: build/GCN3_X86/gem5.opt 

[gem5-users] Re: Difference between XBase and XURa .isa operands

2022-02-12 Thread Pedro Henrique Exenberger Becker via gem5-users
Hi Jason,

Thanks for the feedback. That was more or less the same conclusion I had reached...
I tried both and verified that they generate very similar code, except
for the name of the variable.
Perhaps it's just for readability, who knows.

Thanks again, anyway!
Pedro.

On Saturday, 12/02/2022 at 06:08, Jason Z via gem5-users wrote:

> Hey Pedro,
>
> I ran into a similar naming issue when trying to create new
> instructions. As best I could tell, the XUR operands are for
> microops, but I'm honestly just guessing based on what I saw previously.
> I ended up just using XBase, since I was creating a store instruction
> modeled after an STRX64 instruction and tried to stick to that
> framework...
>
> Sorry I don't have a better answer, but if anyone else has any insight into,
> or a reference for, their meanings, I'd be interested in it as well...
>
> Respectfully,
>
> Jason Z.
> ___
> gem5-users mailing list -- gem5-users@gem5.org
> To unsubscribe send an email to gem5-users-le...@gem5.org
>


-- 
Pedro Henrique Exenberger Becker
Ph.D. Student at Universitat Politècnica de Catalunya

[gem5-users] Re: riscv-ubuntu 20.04 FS mode

2022-02-12 Thread Νικόλαος Ταμπουρατζής via gem5-users

Dear all,

I have successfully emulated riscv-ubuntu.img through qemu! The
problem was that I use Ubuntu 20.04, which requires Hirsute's
version of u-boot-qemu (wget
http://mirrors.kernel.org/ubuntu/pool/main/u/u-boot/u-boot-qemu_2021.01+dfsg-3ubuntu9_all.deb)
rather than the standard version installed through apt.


Now I have two questions:

1) How can I disable the huge number of services in order to speed up
the gem5 boot process through qemu?


2) When I execute "./build/RISCV/gem5.opt
configs/example/gem5_library/riscv-ubuntu-run.py", it automatically
downloads the riscv-ubuntu-20.04-img. How can I use another image?
The --disk-image option is not working.



Thank you in advance!!!
Best regards,
Nikos



[gem5-users] Re: riscv-ubuntu 20.04 FS mode

2022-02-12 Thread Νικόλαος Ταμπουρατζής via gem5-users

Dear Hoa,

Thank you very much for your information! I tried to emulate the
ubuntu-image through qemu (following this tutorial:
http://resources.gem5.org/resources/riscv-ubuntu) and I get the
following TFTP error:


I appreciate any help!!

Best regards,
Nikos


cossim@cossim-virtual-machine:~/riscv-ubuntu$ ./qemu/build/qemu-system-riscv64 -machine virt -nographic \
 -m 16384 -smp 8 \
 -bios /usr/lib/riscv64-linux-gnu/opensbi/generic/fw_jump.elf \
 -kernel /usr/lib/u-boot/qemu-riscv64_smode/uboot.elf \
 -device virtio-net-device,netdev=eth0 \
 -netdev user,id=eth0,hostfwd=tcp::-:22 \
 -drive file=ubuntu.img,format=raw,if=virtio


OpenSBI v0.9
   ____                    _____ ____ _____
  / __ \                  / ____|  _ \_   _|
 | |  | |_ __   ___ _ __ | (___ | |_) || |
 | |  | | '_ \ / _ \ '_ \ \___ \|  _ < | |
 | |__| | |_) |  __/ | | |____) | |_) || |_
  \____/| .__/ \___|_| |_|_____/|____/_____|
        | |
        |_|

Platform Name : riscv-virtio,qemu
Platform Features : timer,mfdeleg
Platform HART Count   : 8
Firmware Base : 0x8000
Firmware Size : 156 KB
Runtime SBI Version   : 0.2

Domain0 Name  : root
Domain0 Boot HART : 5
Domain0 HARTs : 0*,1*,2*,3*,4*,5*,6*,7*
Domain0 Region00  : 0x8000-0x8003 ()
Domain0 Region01  : 0x-0x (R,W,X)
Domain0 Next Address  : 0x8020
Domain0 Next Arg1 : 0x8220
Domain0 Next Mode : S-mode
Domain0 SysReset  : yes

Boot HART ID  : 5
Boot HART Domain  : root
Boot HART ISA : rv64imafdcsu
Boot HART Features: scounteren,mcounteren,time
Boot HART PMP Count   : 16
Boot HART PMP Granularity : 4
Boot HART PMP Address Bits: 54
Boot HART MHPM Count  : 0
Boot HART MHPM Count  : 0
Boot HART MIDELEG : 0x0222
Boot HART MEDELEG : 0xb109


U-Boot 2021.01+dfsg-3ubuntu0~20.04.4 (Sep 21 2021 - 15:55:38 +)

CPU:   rv64imafdcsu
Model: riscv-virtio,qemu
DRAM:  16 GiB
In:uart@1000
Out:   uart@1000
Err:   uart@1000
Net:   eth0: virtio-net#0
Hit any key to stop autoboot:  0

Device 0: 1af4 VirtIO Block Device
Type: Hard Disk
Capacity: 13824.0 MB = 13.5 GB (28311552 x 512)
... is now current device
Scanning virtio 0:1...
Found /boot/extlinux/extlinux.conf
Retrieving file: /boot/extlinux/extlinux.conf
755 bytes read in 4 ms (183.6 KiB/s)
U-Boot menu
1:  Ubuntu 20.04.3 LTS 5.11.0-1017-generic
2:  Ubuntu 20.04.3 LTS 5.11.0-1017-generic (rescue target)
Enter choice: 1:Ubuntu 20.04.3 LTS 5.11.0-1017-generic
Retrieving file: /boot/initrd.img-5.11.0-1017-generic
170008555 bytes read in 728 ms (222.7 MiB/s)
Retrieving file: /boot/vmlinuz-5.11.0-1017-generic
25258496 bytes read in 148 ms (162.8 MiB/s)
append: root=LABEL=cloudimg-rootfs ro earlycon
Retrieving file: /lib/firmware/5.11.0-1017-generic/device-tree/qemu-riscv.dtb
Failed to load '/lib/firmware/5.11.0-1017-generic/device-tree/qemu-riscv.dtb'
Skipping l0 for failure retrieving fdt
2:  Ubuntu 20.04.3 LTS 5.11.0-1017-generic (rescue target)
Retrieving file: /boot/initrd.img-5.11.0-1017-generic
170008555 bytes read in 44 ms (3.6 GiB/s)
Retrieving file: /boot/vmlinuz-5.11.0-1017-generic
25258496 bytes read in 12 ms (2 GiB/s)
append: root=LABEL=cloudimg-rootfs ro earlycon single
Retrieving file: /lib/firmware/5.11.0-1017-generic/device-tree/qemu-riscv.dtb
Failed to load '/lib/firmware/5.11.0-1017-generic/device-tree/qemu-riscv.dtb'
Skipping l0r for failure retrieving fdt
SCRIPT FAILED: continuing...
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
Scanning disk virtio-blk#8...
** Unrecognized filesystem type **
** Unrecognized filesystem type **
Found 6 disks
** Invalid partition 21 **
Cannot read EFI system partition
BootOrder not defined
EFI boot manager: Cannot load any image
Scanning virtio 0:f...
** Unable to read file / **
Failed to load '/'
libfdt fdt_check_header(): FDT_ERR_BADMAGIC
BootOrder not defined
EFI boot manager: Cannot load any image
scanning bus for devices...

Device 0: unknown device
BOOTP broadcast 1
DHCP client bound to address 10.0.2.15 (4 ms)
Using virtio-net#0 device
TFTP from server 10.0.2.2; our IP address is 10.0.2.15
Filename 'boot.scr.uimg'.
Load address: 0x8810
Loading: *
TFTP error: 'Access violation' (2)
Not retrying...
BOOTP broadcast 1
DHCP client bound to address 10.0.2.15 (1 ms)
Using virtio-net#0 device
TFTP from server 10.0.2.2; our IP address is 10.0.2.15
Filename 'boot.scr.uimg'.
Load address: 0x8400
Loading: *
TFTP error: 'Access violation' (2)
Not retrying...
=>




Quoting Hoa Nguyen via gem5-users :


Hi,

It also took 6+ hours to boot Linux using that disk image on my end, so I
think it's systemd being very slow.

By profiling systemd, I find that systemd took a lot of time to bring up
the networking