[gem5-users] Running DL/ML workloads in gem5

2021-08-24 Thread Da Zhang via gem5-users
Hi,

I tried to run some DL/ML workloads (e.g., DLRM) in gem5 with the KVM CPU to
create checkpoints for future research. However, the workloads hung without
any error messages. Any suggestions? Does anyone have experience running
DL/ML workloads in gem5?

Thank you very much in advance!

best,
Da Zhang
___
gem5-users mailing list -- gem5-users@gem5.org
To unsubscribe send an email to gem5-users-le...@gem5.org

[gem5-users] Is AVX supported in gem5?

2020-01-22 Thread Da Zhang
Hey guys,

Is AVX supported by the atomic CPU and/or O3 CPU in gem5? If not, what
happens when a benchmark is compiled with vectorization and run in X86 full
system mode?

We have created X86 disk images for running the NPB benchmark suite. We
compiled NPB inside the disk images in qemu with kvm, and we confirmed from
the gfortran/gcc optimization reports that many loops in NPB were
vectorized. So far, we have created many checkpoints with the NPB disk image
and run thousands of experiments without obvious problems regarding AVX
support. Since I don't recall gem5 supporting AVX, I am wondering what
actually happened. (Additional info: for almost every vectorized loop in
NPB, gfortran/gcc also prints "loop turned into non-loop; it never loops"
after the vectorization. Does anyone have hints about this?)
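
One way to tell whether the vectorized loops actually use AVX rather than
SSE is to look for the 256-bit ymm registers in a disassembly of the
benchmark binary. The Python sketch below assumes the input is text from a
disassembler such as "objdump -d"; find_avx_lines and the sample lines are
hypothetical and only for illustration. A match means AVX is present, but
the absence of ymm registers does not rule out VEX-encoded instructions
that operate on xmm registers.

```python
import re
from typing import List

# Matches the 256-bit AVX registers ymm0..ymm15 (AT&T or Intel syntax).
YMM_RE = re.compile(r"%?ymm\d+")


def find_avx_lines(disassembly: str) -> List[str]:
    """Return the disassembly lines that reference a ymm register."""
    return [line for line in disassembly.splitlines() if YMM_RE.search(line)]


if __name__ == "__main__":
    # Hypothetical sample: one SSE2 instruction and one AVX instruction.
    sample = "\n".join([
        "  401000: addpd  %xmm1,%xmm0",
        "  401004: vaddpd %ymm1,%ymm2,%ymm0",
    ])
    for line in find_avx_lines(sample):
        print(line)
```

If the function returns an empty list for every NPB binary, the loops were
probably vectorized with SSE only, which would fit the experiments running
without AVX-related problems.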

Thanks in advance.

best,
Da Zhang
___
gem5-users mailing list
gem5-users@gem5.org
http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users

[gem5-users] Does gem5 ARM support /dev/vda ?

2019-04-16 Thread Da Zhang
Hey guys,

I am trying to create an Ubuntu ARM disk image for the aarch64 architecture.
However, when I run gem5 with kvm enabled, the kernel always panics with
"Kernel panic - not syncing: No working init found.  Try passing init=
option to kernel. See Linux Documentation/admin-guide/init.rst for
guidance".

I created the disk image using qemu:
qemu-system-aarch64 \
-cpu host \
-device virtio-scsi-device \
-device scsi-cd,drive=cdrom \
-device virtio-blk-device,drive=hd0 \
-drive "file=ubuntu-16.04.5-server-arm64.iso,id=cdrom,if=none,media=cdrom" \
-drive "if=none,format=qcow2,file=ubuntu-16.04.5-server-arm64.img,id=hd0" \
-pflash "ubuntu-16.04.5-server-arm64-flash0.img" \
-pflash "ubuntu-16.04.5-server-arm64-flash1.img" \
-m 1G \
-machine virt \
-enable-kvm \
-nographic \
-net virtio-net-device,netdev=n1 \
-netdev user,id=n1 \
;

Convert the qcow2 image to raw format and make sure it works (e.g., reboot
the raw format image via qemu-system-aarch64)

Make m5 using:
cd gem5/util/m5
make -f Makefile.arm

I copied m5 to /sbin/ on the guest ARM system with scp and created a link
/sbin/gem5 pointing to /sbin/m5. Then I set up gem5.service and created
gem5init as shown in Jason's guide
"http://www.lowepower.com/jason/setting-up-gem5-full-system.html". I also
removed some conf files from the guest's /etc/init folder as described on
http://gem5.org/Ubuntu_Disk_Image_for_ARM_Full_System. Otherwise I didn't
follow that guide, since both the host and guest ARM Linux systems I used
are quite different.

Build the arm linux kernel:
git clone https://gem5.googlesource.com/arm/linux
git checkout -b gem5/v4.4
make ARCH=arm gem5_defconfig
make ARCH=arm -j `nproc`
I didn't use a cross-platform compiler since I built the kernel on an arm
board.

Then I tried to run the command:
"build/ARM/gem5.fast --listener-mode=on -d /home/rock64/tmp/arm_
configs/example/fs.py --machine-type=VExpress_GEM5_V1
--kernel=vmlinux.arm.4.4 --disk-image=ubuntu-16.04.5-server-arm64.img
--mem-size=1GB --cpu-type=ArmV8KvmCPU --generate-dtb -n 1".

It gave me a panic:
[2.641819] gem5 DVFS handler is disabled
[2.641822] gem5-energy-ctrl loaded at ff800899e000
[2.641825] gem5_energy_ctrl_mc: gem5_mc_init: DVFS handler in energy controller is disabled, ARM gem5 multi-cluster cpufreq driver will not be registered
[2.642190] usbcore: registered new interface driver usbhid
[2.642191] usbhid: USB HID core driver
[2.642322] NET: Registered protocol family 17
[2.916496] scsi 0:0:0:0: Direct-Access ATA  M5 IDE Disk n/a  PQ: 0 ANSI: 5
[2.935127] sd 0:0:0:0: [sda] 16777216 512-byte logical blocks: (8.59 GB/8.00 GiB)
[2.952327] sd 0:0:0:0: Attached scsi generic sg0 type 0
[2.964648] sd 0:0:0:0: [sda] Write Protect is off
[2.975518] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
[2.986957] sd 0:0:0:0: [sda] Write cache: disabled, read cache: enabled, doesn't support DPO or FUA
[3.101444]  sda: sda1 sda2 sda3
[3.109804] sd 0:0:0:0: [sda] Attached SCSI disk
[3.452824] VFS: Mounted root (vfat filesystem) on device 8:1.
[3.466085] devtmpfs: error mounting -2
[3.475091] Freeing unused kernel memory: 384K
[3.485519] Kernel panic - not syncing: No working init found.  Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
[3.516805] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 4.14.0+ #2
[3.530257] Hardware name: V2P-CA15 (DT)
[3.539161] Call trace:
[3.544930] [] dump_backtrace+0x0/0x370
[3.557065] [] show_stack+0x14/0x20
[3.568513] [] dump_stack+0x8c/0xac
[3.579903] [] panic+0x11c/0x274
[3.590712] [] kernel_init+0xec/0x100
[3.602463] [] ret_from_fork+0x10/0x18
[3.614433] Kernel Offset: disabled
[3.622386] CPU features: 0x002000
[3.630153] Memory Limit: 1024 MB
[3.637726] ---[ end Kernel panic - not syncing: No working init found.  Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.


One difference I did notice is that the ARM disk image I created uses
/dev/vda, whereas both the ARM image downloaded from the gem5 website and
our x86 disk image use /dev/sda. Could this be the problem, or is something
else wrong?
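
The boot log above offers one hint: the kernel reports "Mounted root (vfat
filesystem) on device 8:1". Block major 8 is the Linux sd driver and minor 1
is the first partition of the first disk, i.e. /dev/sda1, and a vfat
partition is more likely an EFI system partition than an Ubuntu root
filesystem. A minimal Python sketch for pulling this information out of a
saved console log follows; parse_root_mount and sd_name are hypothetical
helper names, not part of gem5.

```python
import re
from typing import Optional, Tuple

# Kernel message printed once the root filesystem has been mounted.
ROOT_RE = re.compile(
    r"VFS: Mounted root \((\w+) filesystem\).*? on device (\d+):(\d+)"
)


def parse_root_mount(log: str) -> Optional[Tuple[str, int, int]]:
    """Return (fstype, major, minor) of the mounted root, or None."""
    match = ROOT_RE.search(log)
    if match is None:
        return None
    return match.group(1), int(match.group(2)), int(match.group(3))


def sd_name(major: int, minor: int) -> Optional[str]:
    """Map a major-8 device number to its sdXN name (16 minors per disk)."""
    if major != 8:
        return None
    disk, part = divmod(minor, 16)
    name = "sd" + chr(ord("a") + disk)
    return name + (str(part) if part else "")


if __name__ == "__main__":
    line = "[3.452824] VFS: Mounted root (vfat filesystem) on device 8:1."
    fstype, major, minor = parse_root_mount(line)
    print(fstype, sd_name(major, minor))
```

If the Ubuntu root filesystem lives on a later partition (the log lists
sda1, sda2 and sda3), the root= kernel argument may need to point there
instead; this is a guess from the log, not a confirmed diagnosis.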

Thanks in advance.

best,
Da Zhang

[gem5-users] Any guides for preparing and running gem5 for ARM with KVM support

2019-02-27 Thread Da Zhang
Hey guys,

Are there any guides for preparing and running gem5 for ARM with KVM
support? Something like
"http://www.lowepower.com/jason/setting-up-gem5-full-system.html", but for
ARM?
We have a ROCKPro64 board with a Cortex-A53 CPU. I tried the ARM disk image
and kernel (kernel: "vmlinux.vexpress_gem5_v1_64", machine:
"VExpress_GEM5_V1") from the gem5 website with ArmV8KvmCPU, but it hung after
"[   15.648539] Buffer I/O error on dev sda1, logical block 524278, async
page read" with many "Illegal instruction" messages. (I was able to log in to
the system with AtomicSimpleCPU; it shows similar terminal output but without
"Illegal instruction".) I am new to ARM and ran into more trouble when I
tried to create an ARM disk image with qemu.

Thanks a lot in advance.

best,
Da Zhang

[gem5-users] ARM simulation (AtomicSimpleCPU and fs mode) only show 1 cpu

2019-01-17 Thread Da Zhang
Hey guys,

I am using gem5 to simulate an ARM system in fs mode with AtomicSimpleCPU
in order to make some checkpoints. However, inside the guest, lscpu shows
only 1 CPU even though I passed -n 2 to simulate 2 CPUs.

My configurations are:
executable: build/ARM/gem5.fast
python script: configs/example/fs.py
image: aarch64-ubuntu-trusty-headless.img
kernel: vmlinux.vexpress_gem5_v1_64
dtb: armv7_gem5_v1_2cpu.dtb
-n: 2
machine-type: VExpress_GEM5_V1
mem-size: 4GB
sys-clock: 3GHz
cpu-type: AtomicSimpleCPU
--checkpoint-at-end

I am trying to run 2 instances of the benchmark on a 2 cpu arm system. But
with either lscpu or cat /proc/cpuinfo, it only shows one cpu. Any clue or
suggestions?
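
To compare what the guest sees against the -n value, the CPU count can be
read from the text of /proc/cpuinfo. A minimal Python sketch, assuming the
usual "processor : N" entries; count_cpus is a hypothetical helper. It may
also be worth checking that the DTB matches the kernel: the list above pairs
a DTB named armv7_gem5_v1_2cpu.dtb with a 64-bit kernel and image, and
whether that is the cause here is only a guess.

```python
import re

# One "processor : N" entry is printed per online CPU.
PROCESSOR_RE = re.compile(r"processor\s*:\s*\d+")


def count_cpus(cpuinfo_text: str) -> int:
    """Count the online CPUs listed in the text of /proc/cpuinfo."""
    return sum(
        1 for line in cpuinfo_text.splitlines() if PROCESSOR_RE.match(line)
    )


if __name__ == "__main__":
    # Runs inside the guest; on other systems the file may not exist.
    with open("/proc/cpuinfo") as cpuinfo:
        print(count_cpus(cpuinfo.read()))
```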

Thanks in advance.

Best,
Da Zhang

Re: [gem5-users] dacapo (java) benchmark suite encounters "SIGSEGV" and "null exception" during timing mode (fs mode) after restarting from a checkpoint

2018-07-19 Thread Da Zhang
I just did a quick test for one benchmark using sync and no COW layer, but
it still encountered SIGSEGV. I took the checkpoint by running the
benchmark in the background as root (with KVM CPU); I warmed up JIT for 1
round and took the checkpoint in the second round by using "sync && m5
exit" with --checkpoint-at-end.
These are two SIGSEGVs (same checkpoint with different fast forward time):
1.
#  SIGSEGV (0xb) at pc=0x7f8b592c4b80, pid=1482, tid=0x7f8b50408700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J 2671 C2 java.math.MutableBigInteger.divideMagnitude(Ljava/math/MutableBigInteger;Ljava/math/MutableBigInteger;Z)Ljava/math/MutableBigInteger; (1307 bytes) @ 0x7f8b592c4b80 [0x7f8b592c4700+0x480]

2.
#  SIGSEGV (0xb) at pc=0x7f8b6e6b02e4, pid=1482, tid=0x7f8b50408700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# V  [libjvm.so+0x5952e4]  frame::sender(RegisterMap*) const+0x114
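
In the first report, the faulting pc sits inside JIT-compiled (C2) code, and
the bracketed pair on the "Problematic frame" line appears to be the start
of the compiled method plus an offset. That the two add up to the reported
pc can be checked mechanically. A small Python sketch; frame_offset_matches
is a hypothetical helper, and the reading of the bracketed pair is an
assumption based on the numbers above.

```python
import re
from typing import Optional

# Matches "@ <pc> [<base>+<offset>]" at the end of a compiled-frame line.
FRAME_RE = re.compile(r"@ (0x[0-9a-f]+) \[(0x[0-9a-f]+)\+(0x[0-9a-f]+)\]")


def frame_offset_matches(frame_line: str) -> Optional[bool]:
    """Return True if base + offset equals the pc, None if no match."""
    match = FRAME_RE.search(frame_line)
    if match is None:
        return None
    pc, base, offset = (int(group, 16) for group in match.groups())
    return base + offset == pc


if __name__ == "__main__":
    line = "(1307 bytes) @ 0x7f8b592c4b80 [0x7f8b592c4700+0x480]"
    print(frame_offset_matches(line))
```

The second report faults inside libjvm.so itself (a "V" frame), so it has no
such pair to check.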



On Thu, Jul 19, 2018 at 3:12 PM Gutierrez, Anthony <
anthony.gutier...@amd.com> wrote:

> Yes, make sure all buffers are flushed, etc., before taking your
> checkpoint you can call the “sync” command, which should be already
> installed on the image. You’ll need to call sync before your commands to
> halt and take a checkpoint.
>
>
>
> This page explains how I did the same for an Android disk image:
> http://gem5.org/BBench-gem5#Tips_for_Making_Your_Disk_Image_gem5_Friendly
>
>
>
> -Tony
>
>
>
> *From:* gem5-users  *On Behalf Of *Da Zhang
> *Sent:* Thursday, July 19, 2018 12:00 PM
> *To:* gem5 users mailing list 
> *Subject:* Re: [gem5-users] dacapo (java) benchmark suite encounters
> "SIGSEGV" and "null exception" during timing mode (fs mode) after
> restarting from a checkpoint
>
>
>
> Hey Gutierrez,
>
>
>
> "*sync* the disk image", do you mean making sure all disk modifications
> are actually made on the disk (update to date) before taking the
> checkpoint? How to do that?
>
> I haven't tried to take a checkpoint with COW layer disabled and then
> restart from that checkpoint before. All I have done is "ctrl+c" to stop
> gem5 to take the checkpoint (--checkpoint-at-end); I rely on gem5 to take
> care of all things that need to be checked when taking checkpoints.
>
>
>
> Best,
>
> Da Zhang
>
>
>
> On Thu, Jul 19, 2018 at 2:36 PM Gutierrez, Anthony <
> anthony.gutier...@amd.com> wrote:
>
> JIT was precisely the issue I was thinking was causing this. One thing may
> be necessary, that is to ensure you *sync* the disk image before taking
> your checkpoint.
>
>
>
> gem5’s debug flags should help you identify something like a hang, for
> example an ExecAll trace. A SyscallAll trace would most likely help you
> understand better what the JIT is doing.
>
>
>
> *From:* gem5-users  *On Behalf Of *Da Zhang
> *Sent:* Thursday, July 19, 2018 11:15 AM
> *To:* gem5 users mailing list 
> *Subject:* Re: [gem5-users] dacapo (java) benchmark suite encounters
> "SIGSEGV" and "null exception" during timing mode (fs mode) after
> restarting from a checkpoint
>
>
>
> Thanks for the suggestions.
>
> I have been trying a couple of solutions (I only test for  a small subset
> of decapo benchmark suite, which encounters segfault with O3CPU):
>
>
>
> 1. using TimingSimpleCPU: no segfaults
>
> 2. disable COW layer and write on the disk image when taking checkpoint:
> there are still segfaults
>
> 3. take checkpoints with JIT compiler disabled (20x slowdown): no segfaults
>
> 4. take checkpoints during atomic mode (without warming up JIT): no
> segfaults
>
> 5. take checkpoints with Java OOPs compress disabled: there are still
> segfaults
>
>
>
> One thing that I can't tell is if the benchmark hangs since there is no
> printing during the execution. Is there a statistic I can use to tell if
> the benchmark hangs?
>
>
>
> So far, all my experiments are running using 1CPU (even some benchmarks
> are multithreading). I attempted to take some checkpoints with more CPUs
> with KVM CPU. But unfortunately, I got some "rcu_sched self-detected stall
> on CPU" issues. Any idea?
>
>
>
> On Mon, Jul 16, 2018 at 5:47 PM Gutierrez, Anthony <
> anthony.gutier...@amd.com> wrote:
>
> Da,
>
>
>
> Do you encounter the segfault only when restoring from a ch

Re: [gem5-users] dacapo (java) benchmark suite encounters "SIGSEGV" and "null exception" during timing mode (fs mode) after restarting from a checkpoint

2018-07-19 Thread Da Zhang
Thanks a lot for the tips. I will give a try.

best,
Da

On Thu, Jul 19, 2018 at 3:12 PM Gutierrez, Anthony <
anthony.gutier...@amd.com> wrote:

> Yes, make sure all buffers are flushed, etc., before taking your
> checkpoint you can call the “sync” command, which should be already
> installed on the image. You’ll need to call sync before your commands to
> halt and take a checkpoint.
>
>
>
> This page explains how I did the same for an Android disk image:
> http://gem5.org/BBench-gem5#Tips_for_Making_Your_Disk_Image_gem5_Friendly
>
>
>
> -Tony
>
>
>
> *From:* gem5-users  *On Behalf Of *Da Zhang
> *Sent:* Thursday, July 19, 2018 12:00 PM
> *To:* gem5 users mailing list 
> *Subject:* Re: [gem5-users] dacapo (java) benchmark suite encounters
> "SIGSEGV" and "null exception" during timing mode (fs mode) after
> restarting from a checkpoint
>
>
>
> Hey Gutierrez,
>
>
>
> "*sync* the disk image", do you mean making sure all disk modifications
> are actually made on the disk (update to date) before taking the
> checkpoint? How to do that?
>
> I haven't tried to take a checkpoint with COW layer disabled and then
> restart from that checkpoint before. All I have done is "ctrl+c" to stop
> gem5 to take the checkpoint (--checkpoint-at-end); I rely on gem5 to take
> care of all things that need to be checked when taking checkpoints.
>
>
>
> Best,
>
> Da Zhang
>
>
>
> On Thu, Jul 19, 2018 at 2:36 PM Gutierrez, Anthony <
> anthony.gutier...@amd.com> wrote:
>
> JIT was precisely the issue I was thinking was causing this. One thing may
> be necessary, that is to ensure you *sync* the disk image before taking
> your checkpoint.
>
>
>
> gem5’s debug flags should help you identify something like a hang, for
> example an ExecAll trace. A SyscallAll trace would most likely help you
> understand better what the JIT is doing.
>
>
>
> *From:* gem5-users  *On Behalf Of *Da Zhang
> *Sent:* Thursday, July 19, 2018 11:15 AM
> *To:* gem5 users mailing list 
> *Subject:* Re: [gem5-users] dacapo (java) benchmark suite encounters
> "SIGSEGV" and "null exception" during timing mode (fs mode) after
> restarting from a checkpoint
>
>
>
> Thanks for the suggestions.
>
> I have been trying a couple of solutions (I only test for  a small subset
> of decapo benchmark suite, which encounters segfault with O3CPU):
>
>
>
> 1. using TimingSimpleCPU: no segfaults
>
> 2. disable COW layer and write on the disk image when taking checkpoint:
> there are still segfaults
>
> 3. take checkpoints with JIT compiler disabled (20x slowdown): no segfaults
>
> 4. take checkpoints during atomic mode (without warming up JIT): no
> segfaults
>
> 5. take checkpoints with Java OOPs compress disabled: there are still
> segfaults
>
>
>
> One thing that I can't tell is if the benchmark hangs since there is no
> printing during the execution. Is there a statistic I can use to tell if
> the benchmark hangs?
>
>
>
> So far, all my experiments are running using 1CPU (even some benchmarks
> are multithreading). I attempted to take some checkpoints with more CPUs
> with KVM CPU. But unfortunately, I got some "rcu_sched self-detected stall
> on CPU" issues. Any idea?
>
>
>
> On Mon, Jul 16, 2018 at 5:47 PM Gutierrez, Anthony <
> anthony.gutier...@amd.com> wrote:
>
> Da,
>
>
>
> Do you encounter the segfault only when restoring from a checkpoint? That
> is, if you do not use checkpoints can any DaCapo benchmark successfully
> complete under one of the simple CPU models (and not just KVM CPU)?
>
>
>
> If so, you may want to get a syscall trace (e.g., using strace) to see
> what sorts of files the JVM is trying to read etc. It’s possible that the
> VM generates some files that it will read back later. If you use
> checkpoints, due to the disk image COW layer, I do not believe any disk
> updates are checkpointed, thus these files will not persist, which could
> lead to some weird segfault issues. Not sure if this is happening in your
> case, but it may be worth investigating.
>
>
>
> I created some of the original Android disk images, and the original
> DaCapo image, and at that time I would typically run the benchmarks thru
> the FS mode and Atomic CPU once, with the COW layer disabled, in order to
> generate the needed files on the disk image and have them persist. This was
> entirely for performance, however, to prevent the VMs from regenerating the
> same files for each run, but I can envision it causing issues during
> runtime as well. In particular, it seems you’re code is faulting while

Re: [gem5-users] dacapo (java) benchmark suite encounters "SIGSEGV" and "null exception" during timing mode (fs mode) after restarting from a checkpoint

2018-07-19 Thread Da Zhang
Hey Gutierrez,

"*sync* the disk image", do you mean making sure all disk modifications are
actually made on the disk (update to date) before taking the checkpoint?
How to do that?
I haven't tried to take a checkpoint with COW layer disabled and then
restart from that checkpoint before. All I have done is "ctrl+c" to stop
gem5 to take the checkpoint (--checkpoint-at-end); I rely on gem5 to take
care of all things that need to be checked when taking checkpoints.

Best,
Da Zhang

On Thu, Jul 19, 2018 at 2:36 PM Gutierrez, Anthony <
anthony.gutier...@amd.com> wrote:

> JIT was precisely the issue I was thinking was causing this. One thing may
> be necessary, that is to ensure you *sync* the disk image before taking
> your checkpoint.
>
>
>
> gem5’s debug flags should help you identify something like a hang, for
> example an ExecAll trace. A SyscallAll trace would most likely help you
> understand better what the JIT is doing.
>
>
>
> *From:* gem5-users  *On Behalf Of *Da Zhang
> *Sent:* Thursday, July 19, 2018 11:15 AM
> *To:* gem5 users mailing list 
> *Subject:* Re: [gem5-users] dacapo (java) benchmark suite encounters
> "SIGSEGV" and "null exception" during timing mode (fs mode) after
> restarting from a checkpoint
>
>
>
> Thanks for the suggestions.
>
> I have been trying a couple of solutions (I only test for  a small subset
> of decapo benchmark suite, which encounters segfault with O3CPU):
>
>
>
> 1. using TimingSimpleCPU: no segfaults
>
> 2. disable COW layer and write on the disk image when taking checkpoint:
> there are still segfaults
>
> 3. take checkpoints with JIT compiler disabled (20x slowdown): no segfaults
>
> 4. take checkpoints during atomic mode (without warming up JIT): no
> segfaults
>
> 5. take checkpoints with Java OOPs compress disabled: there are still
> segfaults
>
>
>
> One thing that I can't tell is if the benchmark hangs since there is no
> printing during the execution. Is there a statistic I can use to tell if
> the benchmark hangs?
>
>
>
> So far, all my experiments are running using 1CPU (even some benchmarks
> are multithreading). I attempted to take some checkpoints with more CPUs
> with KVM CPU. But unfortunately, I got some "rcu_sched self-detected stall
> on CPU" issues. Any idea?
>
>
>
> On Mon, Jul 16, 2018 at 5:47 PM Gutierrez, Anthony <
> anthony.gutier...@amd.com> wrote:
>
> Da,
>
>
>
> Do you encounter the segfault only when restoring from a checkpoint? That
> is, if you do not use checkpoints can any DaCapo benchmark successfully
> complete under one of the simple CPU models (and not just KVM CPU)?
>
>
>
> If so, you may want to get a syscall trace (e.g., using strace) to see
> what sorts of files the JVM is trying to read etc. It’s possible that the
> VM generates some files that it will read back later. If you use
> checkpoints, due to the disk image COW layer, I do not believe any disk
> updates are checkpointed, thus these files will not persist, which could
> lead to some weird segfault issues. Not sure if this is happening in your
> case, but it may be worth investigating.
>
>
>
> I created some of the original Android disk images, and the original
> DaCapo image, and at that time I would typically run the benchmarks thru
> the FS mode and Atomic CPU once, with the COW layer disabled, in order to
> generate the needed files on the disk image and have them persist. This was
> entirely for performance, however, to prevent the VMs from regenerating the
> same files for each run, but I can envision it causing issues during
> runtime as well. In particular, it seems you’re code is faulting while
> doing some XML serializing/deserializing, perhaps the xml file it is
> looking for is gone?
>
>
>
> Beyond that, assuming it is a real bug in gem5, I would recommend an
> ExecAll trace to figure out why the instruction at that PC is faulting.
>
>
>
> -Tony
>
>
>
> *From:* gem5-users [mailto:gem5-users-boun...@gem5.org] *On Behalf Of *Da
> Zhang
> *Sent:* Monday, July 16, 2018 1:50 PM
> *To:* gem5 users mailing list 
> *Subject:* Re: [gem5-users] dacapo (java) benchmark suite encounters
> "SIGSEGV" and "null exception" during timing mode (fs mode) after
> restarting from a checkpoint
>
>
>
> Hey Jason,
>
>
>
> There are a bunch of "warn: instruction 'prefetch_nta' unimplemented" in
> atomic modes, during which the java benchmarks don't crash. However, there
> is no these kind of warnings during timing mode. Does it imply that
> unimplemented instructions don't cause the problem? Any clues or
> suggestions to debug these p

Re: [gem5-users] dacapo (java) benchmark suite encounters "SIGSEGV" and "null exception" during timing mode (fs mode) after restarting from a checkpoint

2018-07-19 Thread Da Zhang
Thanks for the suggestions.
I have been trying a couple of solutions (I only tested a small subset of
the DaCapo benchmark suite, the part that segfaults with O3CPU):

1. using TimingSimpleCPU: no segfaults
2. disabling the COW layer and writing to the disk image when taking the
checkpoint: there are still segfaults
3. taking checkpoints with the JIT compiler disabled (20x slowdown): no
segfaults
4. taking checkpoints in atomic mode (without warming up the JIT): no
segfaults
5. taking checkpoints with Java compressed oops disabled: there are still
segfaults

One thing that I can't tell is if the benchmark hangs since there is no
printing during the execution. Is there a statistic I can use to tell if
the benchmark hangs?
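
One possible approach is to dump statistics periodically and compare an
instruction counter between dumps. The Python sketch below assumes a
stats.txt in gem5's plain "name value # description" layout; the stat name
is a parameter because it differs between CPU models and gem5 versions, and
"sim_insts" in the example is only an assumed name. stat_per_dump is a
hypothetical helper.

```python
from typing import List


def stat_per_dump(stats_text: str, stat_name: str) -> List[float]:
    """Return the value of stat_name from every dump found in stats_text."""
    values = []
    for line in stats_text.splitlines():
        fields = line.split()
        # A stat line looks like: "<name> <value> # <description>".
        if len(fields) >= 2 and fields[0] == stat_name:
            values.append(float(fields[1]))
    return values


if __name__ == "__main__":
    # Hypothetical excerpt with two dumps and an assumed stat name.
    sample = "\n".join([
        "---------- Begin Simulation Statistics ----------",
        "sim_insts 1000 # Number of instructions simulated",
        "---------- End Simulation Statistics   ----------",
        "---------- Begin Simulation Statistics ----------",
        "sim_insts 2500 # Number of instructions simulated",
        "---------- End Simulation Statistics   ----------",
    ])
    print(stat_per_dump(sample, "sim_insts"))
```

A counter that keeps growing only shows that instructions are being
executed; a benchmark spinning in a loop would look the same, so this is a
coarse check at best.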

So far, all my experiments use 1 CPU (even though some benchmarks are
multithreaded). I attempted to take some checkpoints with more CPUs using
the KVM CPU, but unfortunately I got some "rcu_sched self-detected stall on
CPU" issues. Any ideas?

On Mon, Jul 16, 2018 at 5:47 PM Gutierrez, Anthony <
anthony.gutier...@amd.com> wrote:

> Da,
>
>
>
> Do you encounter the segfault only when restoring from a checkpoint? That
> is, if you do not use checkpoints can any DaCapo benchmark successfully
> complete under one of the simple CPU models (and not just KVM CPU)?
>
>
>
> If so, you may want to get a syscall trace (e.g., using strace) to see
> what sorts of files the JVM is trying to read etc. It’s possible that the
> VM generates some files that it will read back later. If you use
> checkpoints, due to the disk image COW layer, I do not believe any disk
> updates are checkpointed, thus these files will not persist, which could
> lead to some weird segfault issues. Not sure if this is happening in your
> case, but it may be worth investigating.
>
>
>
> I created some of the original Android disk images, and the original
> DaCapo image, and at that time I would typically run the benchmarks thru
> the FS mode and Atomic CPU once, with the COW layer disabled, in order to
> generate the needed files on the disk image and have them persist. This was
> entirely for performance, however, to prevent the VMs from regenerating the
> same files for each run, but I can envision it causing issues during
> runtime as well. In particular, it seems you’re code is faulting while
> doing some XML serializing/deserializing, perhaps the xml file it is
> looking for is gone?
>
>
>
> Beyond that, assuming it is a real bug in gem5, I would recommend an
> ExecAll trace to figure out why the instruction at that PC is faulting.
>
>
>
> -Tony
>
>
>
> *From:* gem5-users [mailto:gem5-users-boun...@gem5.org] *On Behalf Of *Da
> Zhang
> *Sent:* Monday, July 16, 2018 1:50 PM
> *To:* gem5 users mailing list 
> *Subject:* Re: [gem5-users] dacapo (java) benchmark suite encounters
> "SIGSEGV" and "null exception" during timing mode (fs mode) after
> restarting from a checkpoint
>
>
>
> Hey Jason,
>
>
>
> There are a bunch of "warn: instruction 'prefetch_nta' unimplemented" in
> atomic modes, during which the java benchmarks don't crash. However, there
> is no these kind of warnings during timing mode. Does it imply that
> unimplemented instructions don't cause the problem? Any clues or
> suggestions to debug these problems?
>
>
>
> best,
>
> Da Zhang
>
>
>
>
>
>
>
> On Mon, Jul 16, 2018 at 1:32 PM Jason Lowe-Power 
> wrote:
>
> Hello,
>
>
>
> Are you seeing any warnings like "warn: Instruction XXX not implemented"?
>
>
>
> There are many X86 SIMD instructions that are currently unimplemented. I
> would bet that your application is using some of those instructions and
> getting 0's as the output instead of the correct value.
>
>
>
> The "right" way to solve this problem is to implement these instructions
> (and we would really appreciate it if you contribute your fixes back on
> https://gem5-review.googlesource.com). The other option is to recompile
> your applications without SIMD extensions (e.g., -march=athlon64 or
> whatever is the original x86-64 name in GCC). However, this likely requires
> compiling all of the java runtime in your case.
>
>
>
> Cheers,
>
> Jason
>
>
>
> On Mon, Jul 16, 2018 at 10:11 AM Da Zhang  wrote:
>
> To clarify, "SIGSEGV and null exceptions " happens to the benchmark
> suite, not gem5. Gem5 is running without errors. But in the
> system.pc.com_1.device files, I observe that most of the benchmarks crash
> due to SIGSEGV or null exceptions.
>
> Example:
>
> "
>
>  x/system.pc.com_1.device
>
>
>
>   buffers
>
>   1 #
>
>   2 # A fatal error has been detected by the Java 

Re: [gem5-users] dacapo (java) benchmark suite encounters "SIGSEGV" and "null exception" during timing mode (fs mode) after restarting from a checkpoint

2018-07-16 Thread Da Zhang
Hey Jason,

There are a bunch of "warn: instruction 'prefetch_nta' unimplemented"
messages in atomic mode, during which the Java benchmarks don't crash.
However, there are no such warnings in timing mode. Does this imply that
unimplemented instructions are not the cause of the problem? Any clues or
suggestions for debugging these problems?

best,
Da Zhang



On Mon, Jul 16, 2018 at 1:32 PM Jason Lowe-Power 
wrote:

> Hello,
>
> Are you seeing any warnings like "warn: Instruction XXX not implemented"?
>
> There are many X86 SIMD instructions that are currently unimplemented. I
> would bet that your application is using some of those instructions and
> getting 0's as the output instead of the correct value.
>
> The "right" way to solve this problem is to implement these instructions
> (and we would really appreciate it if you contribute your fixes back on
> https://gem5-review.googlesource.com). The other option is to recompile
> your applications without SIMD extensions (e.g., -march=athlon64 or
> whatever is the original x86-64 name in GCC). However, this likely requires
> compiling all of the java runtime in your case.
>
> Cheers,
> Jason
>
> On Mon, Jul 16, 2018 at 10:11 AM Da Zhang  wrote:
>
>> To clarify, "SIGSEGV and null exceptions " happens to the benchmark
>> suite, not gem5. Gem5 is running without errors. But in the
>> system.pc.com_1.device files, I observe that most of the benchmarks crash
>> due to SIGSEGV or null exceptions.
>> Example:
>> "
>> x/system.pc.com_1.device
>> #
>> # A fatal error has been detected by the Java Runtime Environment:
>> #
>> #  SIGSEGV (0xb) at pc=0x7f81d17742b7, pid=1474, tid=0x7f81cf46d700
>> #
>> # JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
>> # Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
>> # Problematic frame:
>> # J 1815 C2 org.apache.xml.serializer.ToHTMLStream.endElement(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)V (389 bytes) @ 0x7f81d17742b7 [0x7f81d1774280+0x37]
>> #
>> #
>> "
>>
>> On Mon, Jul 16, 2018 at 11:39 AM Da Zhang  wrote:
>>
>>> Hey guys,
>>>
>>> I am testing a java benchmark suite, dacapo, on gem5 with fs mode.
>>> Unfortunately, I encounter a lot of  SIGSEGV and null exceptions during
>>> timing mode after restarting from the checkpoints.
>>> I am using linux kernel v4.8.13 and ubuntu-server-16.04.1 with
>>> oracle jdk v8.0_171-b11. To eliminate the influence of my modifications to
>>> gem5 src/ and configs/, I re-download gem5 and checkout to commit
>>> "ee2ffdc0fdb489767768e5273a4ccd7b51735c7c", which is the gem5 version I am
>>> working on. The checkpoint was taken by using kvm cpu with 1 CPU and 16GB
>>> memory. For the simulation, I use build/X86/gem5.opt (in order to enable
>>> assertions) with fs mode (configs/example/fs.py). Other options include
>>> "--cpu-type=DerivO3CPU -n 1 --mem-size=16GB --caches --l2cache
>>> --l2_size=${L2SIZE}" (I try L2SIZE from 256KB to 8MB). I test with 100ms
>>> warmup and 1ps real simulation time. There are no errors presented. But
>>> with longer real simulation time, the benchmark suite crashes with
>>> segfault.
>>> I am able to run the dacapo benchmark suite in fs mode with kvm cpu,
>>> without any segfaults or exceptions. I have some simple java benchmarks
>>> tested; neither segfaults nor exceptions present.
>>> Does anyone have suggestions or experience against these issues?
>>>
>>> best,
>>> Da Zhang
>>>

Re: [gem5-users] dacapo (java) benchmark suite encounters "SIGSEGV" and "null exception" during timing mode (fs mode) after restarting from a checkpoint

2018-07-16 Thread Da Zhang
To clarify, the SIGSEGVs and null exceptions happen in the benchmark suite,
not in gem5. gem5 itself runs without errors. But in the
system.pc.com_1.device files, I see that most of the benchmarks crash due to
SIGSEGV or null exceptions.
Example:
"
x/system.pc.com_1.device
#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x7f81d17742b7, pid=1474, tid=0x7f81cf46d700
#
# JRE version: Java(TM) SE Runtime Environment (8.0_171-b11) (build 1.8.0_171-b11)
# Java VM: Java HotSpot(TM) 64-Bit Server VM (25.171-b11 mixed mode linux-amd64 compressed oops)
# Problematic frame:
# J 1815 C2 org.apache.xml.serializer.ToHTMLStream.endElement(Ljava/lang/String;Ljava/lang/String;Ljava/lang/String;)V (389 bytes) @ 0x7f81d17742b7 [0x7f81d1774280+0x37]
#
#
"

On Mon, Jul 16, 2018 at 11:39 AM Da Zhang  wrote:

> Hey guys,
>
> I am testing a java benchmark suite, dacapo, on gem5 with fs mode.
> Unfortunately, I encounter a lot of  SIGSEGV and null exceptions during
> timing mode after restarting from the checkpoints.
> I am using linux kernel v4.8.13 and ubuntu-server-16.04.1 with
> oracle jdk v8.0_171-b11. To eliminate the influence of my modifications to
> gem5 src/ and configs/, I re-download gem5 and checkout to commit
> "ee2ffdc0fdb489767768e5273a4ccd7b51735c7c", which is the gem5 version I am
> working on. The checkpoint was taken by using kvm cpu with 1 CPU and 16GB
> memory. For the simulation, I use build/X86/gem5.opt (in order to enable
> assertions) with fs mode (configs/example/fs.py). Other options include
> "--cpu-type=DerivO3CPU -n 1 --mem-size=16GB --caches --l2cache
> --l2_size=${L2SIZE}" (I try L2SIZE from 256KB to 8MB). I test with 100ms
> warmup and 1ps real simulation time. There are no errors presented. But
> with longer real simulation time, the benchmark suite crashes with
> segfault.
> I am able to run the dacapo benchmark suite in fs mode with kvm cpu,
> without any segfaults or exceptions. I have some simple java benchmarks
> tested; neither segfaults nor exceptions present.
> Does anyone have suggestions or experience against these issues?
>
> best,
> Da Zhang
>

[gem5-users] dacapo (java) benchmark suite encounters "SIGSEGV" and "null exception" during timing mode (fs mode) after restarting from a checkpoint

2018-07-16 Thread Da Zhang
Hey guys,

I am testing a Java benchmark suite, DaCapo, on gem5 in FS mode.
Unfortunately, I encounter a lot of SIGSEGVs and null exceptions in
timing mode after restoring from checkpoints.
I am using Linux kernel v4.8.13 and ubuntu-server-16.04.1 with
Oracle JDK v8.0_171-b11. To rule out my own modifications to
gem5 src/ and configs/, I re-downloaded gem5 and checked out commit
"ee2ffdc0fdb489767768e5273a4ccd7b51735c7c", which is the gem5 version I am
working on. The checkpoint was taken using the KVM CPU with 1 CPU and 16GB
memory. For the simulation, I use build/X86/gem5.opt (in order to enable
assertions) in FS mode (configs/example/fs.py). Other options include
"--cpu-type=DerivO3CPU -n 1 --mem-size=16GB --caches --l2cache
--l2_size=${L2SIZE}" (I tried L2SIZE from 256KB to 8MB). With 100ms of
warmup and 1ps of real simulation time, no errors appear. But
with a longer real simulation time, the benchmark suite crashes with a
segfault.
I am able to run the DaCapo benchmark suite in FS mode with the KVM CPU
without any segfaults or exceptions. I have also tested some simple Java
benchmarks; they show neither segfaults nor exceptions.
Does anyone have suggestions or experience with these issues?

best,
Da Zhang
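
For anyone trying to reproduce the sweep described above, here is a minimal sketch that only prints one fs.py command line per L2 size. It uses just the options quoted in the post. The intermediate sizes and the lowercase "kB" spelling are assumptions, and the checkpoint-restore options are left out because the post does not state them.

```python
import shlex

GEM5_BINARY = "build/X86/gem5.opt"
FS_SCRIPT = "configs/example/fs.py"

# Assumption: power-of-two steps between the two endpoints named in the post
# (256KB and 8MB); the post does not list the intermediate sizes.
L2_SIZES = ["256kB", "512kB", "1MB", "2MB", "4MB", "8MB"]


def fs_command(l2_size):
    """Return the argv list for one run, using only the options quoted above."""
    return [
        GEM5_BINARY, FS_SCRIPT,
        "--cpu-type=DerivO3CPU",
        "-n", "1",
        "--mem-size=16GB",
        "--caches",
        "--l2cache",
        f"--l2_size={l2_size}",
    ]


if __name__ == "__main__":
    # Print the commands instead of running them, so they can be inspected
    # and extended with the restore options for the local checkpoint.
    for size in L2_SIZES:
        print(shlex.join(fs_command(size)))
```

Printing rather than launching keeps the sketch safe to run anywhere; the restore flags for the KVM checkpoint would still have to be appended by hand.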

Re: [gem5-users] Does gem5 count float multiplications?

2018-06-11 Thread Da Zhang
They are also 0s.
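
A quick way to check every floating-point op class at once, including the Simd variants Tariq suggests in the quoted message, is to scan stats.txt for names ending in "::<op class>". The sketch below assumes the usual "name value ... # description" line layout; the sample lines and their values are made up for illustration and are not taken from any run in this thread.

```python
# Op classes discussed in the thread (scalar and SIMD floating-point variants).
OP_CLASSES = ("FloatMult", "FloatDiv", "FloatCmp",
              "SimdFloatMult", "SimdFloatDiv", "SimdFloatCmp")

# Illustrative sample only: stat names and values are hypothetical.
SAMPLE = """\
system.switch_cpus.iq.FU_type_0::FloatAdd            12      0.00%     71.20% # Type of FU issued
system.switch_cpus.iq.FU_type_0::FloatMult            0      0.00%     71.20% # Type of FU issued
system.switch_cpus.iq.FU_type_0::SimdFloatMult       34      0.01%     71.21% # Type of FU issued
"""


def op_class_stats(stats_text, op_classes=OP_CLASSES):
    """Map each stat name ending in '::<op class>' to its numeric value."""
    found = {}
    for line in stats_text.splitlines():
        fields = line.split()
        if len(fields) < 2 or "::" not in fields[0]:
            continue
        if fields[0].rsplit("::", 1)[1] in op_classes:
            try:
                found[fields[0]] = float(fields[1])
            except ValueError:
                continue  # skip lines whose second column is not a number
    return found


if __name__ == "__main__":
    for name, value in op_class_stats(SAMPLE).items():
        print(f"{name}: {value:g}")
```

To use it on a real run, pass the contents of the stats.txt file instead of SAMPLE.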

On Sun, Jun 10, 2018 at 7:05 PM, Tariq Azmy  wrote:

> Did you look into SimdFloatMult, SimdFloatDiv, and SimdFloatCmp in the
> stats to see whether any numbers are listed there? The operations might
> have been executed with SSE instructions.
>
> On Sat, Jun 9, 2018 at 4:02 PM, Da Zhang  wrote:
>
>> Hey guys,
>>
>> I am trying to add floating-point multiplications (and likewise
>> floating-point divisions and comparisons) to the benchmarks as
>> instrumentation, so that I can use the number of floating-point
>> multiplications as a counter for the benchmarks during simulation. I see
>> there is FloatMult next to FloatAdd, so I assume it means float
>> multiplication. However, all of my experiments show 0s.
>> Does gem5 count floating-point multiplications? Or was I looking at the
>> wrong place in the stats file? I also tried floating-point division
>> (FloatDiv) and comparison (FloatCmp), but both are always 0s.
>>
>> best,
>> Da
>>

[gem5-users] Does the system configurations matter when taking checkpoints

2018-06-04 Thread Da Zhang
Hey, guys

I have been using the KVM CPU to take checkpoints for a while. I use the
default settings when running the KVM CPU and modified settings (e.g., cache
latencies) when running the real simulation. So far, everything looks fine.
I just wonder whether the configuration matters when I take checkpoints. In
other words, does the configuration need to match the one used
for simulation? I know the number of CPUs and the size of memory need to be
the same.

best,
Da

Re: [gem5-users] Increasing TLB size not working for X86 with O3CPU

2018-05-28 Thread Da Zhang
Hi, Jason

Sorry for my unclear description before. For our workload, the
switch_cpus.dtb miss rate with 64 TLB entries is 154654 / 1589214 =
9.73%; the miss rate with 1048576 TLB entries is 154360 / 1583757 = 9.75%.
Both runs use a 20ms warmup in atomic mode and 2.5ms of real simulation
with O3CPU. The miss rates are practically identical and very high,
especially for 1048576 entries with a heap size of only 1MB.

Any ideas or suggestions? Please let me know if other statistics or config
information would be helpful.

best,
Da
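
As a cross-check, the sketch below recomputes the two miss rates from the raw counts in this message and the TLB reach (entries times page size) behind the 512 * 4KB estimate in Jason's quoted reply. The 4KB page size comes from that estimate and the 1MB heap from the earlier description; the function names are only illustrative.

```python
PAGE_SIZE = 4 * 1024          # 4KB pages, as in the 512 * 4KB estimate
HEAP_SIZE = 1 * 1024 * 1024   # 1MB heap of the microbenchmark


def miss_rate(misses, accesses):
    """Fraction of TLB accesses that missed."""
    return misses / accesses


def tlb_reach(entries, page_size=PAGE_SIZE):
    """Bytes of address space covered when every TLB entry is in use."""
    return entries * page_size


if __name__ == "__main__":
    # (misses, accesses) per dtb size, copied from the message above.
    counts = {64: (154654, 1589214), 1048576: (154360, 1583757)}
    for entries, (misses, accesses) in counts.items():
        reach_kb = tlb_reach(entries) // 1024
        covers = tlb_reach(entries) >= HEAP_SIZE
        print(f"{entries:>8} entries: miss rate {miss_rate(misses, accesses):.2%}, "
              f"reach {reach_kb} KB, covers 1MB heap: {covers}")
```

With 4KB pages, 64 entries cover less than the 1MB heap while 256 or more cover all of it, so a miss rate that does not move between 64 and 1048576 entries suggests the misses are not capacity misses on the heap.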

On Mon, May 28, 2018 at 12:03 PM, Jason Lowe-Power 
wrote:

> Hi Da,
>
> "For size > 512, the whole stats.txt is identical."
>
> This isn't surprising. 512*4KB = 2MB. So, if your workload is only 1MB
> when you have at least 512 entries you are only seeing compulsory (cold)
> misses. Try running larger workloads and/or workloads with more reuse.
>
> Cheers,
> Jason
>
> On Thu, May 24, 2018 at 9:11 AM Da Zhang  wrote:
>
>> I am using FS mode.
>>
>> On Thu, May 24, 2018 at 12:00 PM, Jason Lowe-Power 
>> wrote:
>>
>>> Hi Da,
>>>
>>> Are you using SE mode or FS mode? IIRC, the TLB size does nothing in SE
>>> mode (it doesn't use a TLB). The TLB is only used in FS mode.
>>>
>>> Jason
>>>
>>> On Thu, May 24, 2018 at 8:45 AM Da Zhang  wrote:
>>>
>>>> Hey guys,
>>>>
>>>> I tried to increase the dtb size (i.e., number of tlb entries) for our
>>>> research. However, the stats.txt for the different dtb size
>>>> (64,128,256,512,1024,2048,1048576) is practical identical or
>>>> identical. For size < 512, the system.switch_cpus.dtb.rdAccesses difference
>>>> is only several hundred. For size > 512, the whole stats.txt is identical.
>>>> I am working for the X86 architecture. I change the size in X86TLB.py to
>>>> increase the dtb size. By checking the config.ini file, I see the size is
>>>> set as expected (under system.cpu.dtb). Any clue?
>>>>
>>>> Thanks in advance.
>>>>
>>>> Best,
>>>> Da
>>>>
>>>>

Re: [gem5-users] Increasing TLB size not working for X86 with O3CPU

2018-05-24 Thread Da Zhang
I am using FS mode.

On Thu, May 24, 2018 at 12:00 PM, Jason Lowe-Power <ja...@lowepower.com>
wrote:

> Hi Da,
>
> Are you using SE mode or FS mode? IIRC, the TLB size does nothing in SE
> mode (it doesn't use a TLB). The TLB is only used in FS mode.
>
> Jason
>
> On Thu, May 24, 2018 at 8:45 AM Da Zhang <d...@vt.edu> wrote:
>
>> Hey guys,
>>
>> I tried to increase the dtb size (i.e., number of tlb entries) for our
>> research. However, the stats.txt for the different dtb size
>> (64,128,256,512,1024,2048,1048576) is practical identical or identical.
>> For size < 512, the system.switch_cpus.dtb.rdAccesses difference is only
>> several hundred. For size > 512, the whole stats.txt is identical. I am
>> working for the X86 architecture. I change the size in X86TLB.py to
>> increase the dtb size. By checking the config.ini file, I see the size is
>> set as expected (under system.cpu.dtb). Any clue?
>>
>> Thanks in advance.
>>
>> Best,
>> Da
>>
>>

Re: [gem5-users] Increasing TLB size not working for X86 with O3CPU

2018-05-24 Thread Da Zhang
More details:

The dtb read miss rate stays at 10%. Our workload is a simple sequential
linked-list search microbenchmark with a fixed heap size of 1MB. The cache
size is varied from 128KB to 2MB.

On Thu, May 24, 2018 at 11:44 AM, Da Zhang <d...@vt.edu> wrote:

> Hey guys,
>
> I tried to increase the dtb size (i.e., number of tlb entries) for our
> research. However, the stats.txt for the different dtb size
> (64,128,256,512,1024,2048,1048576) is practical identical or identical.
> For size < 512, the system.switch_cpus.dtb.rdAccesses difference is only
> several hundred. For size > 512, the whole stats.txt is identical. I am
> working for the X86 architecture. I change the size in X86TLB.py to
> increase the dtb size. By checking the config.ini file, I see the size is
> set as expected (under system.cpu.dtb). Any clue?
>
> Thanks in advance.
>
> Best,
> Da
>
>
>

[gem5-users] Increasing TLB size not working for X86 with O3CPU

2018-05-24 Thread Da Zhang
Hey guys,

I tried to increase the dtb size (i.e., the number of TLB entries) for our
research. However, the stats.txt files for the different dtb sizes
(64, 128, 256, 512, 1024, 2048, 1048576) are practically or fully identical.
For sizes < 512, system.switch_cpus.dtb.rdAccesses differs by only
several hundred. For sizes > 512, the whole stats.txt is identical. I am
working with the X86 architecture. I changed the size in X86TLB.py to
increase the dtb size. By checking the config.ini file, I see the size is
set as expected (under system.cpu.dtb). Any clue?

Thanks in advance.

Best,
Da

Re: [gem5-users] gem5 crash with some cpu related error in real simulation after restore from a checkpoint

2018-05-04 Thread Da Zhang
Moreover, it seems to happen only with multi-process benchmarks; my other
multithreaded benchmarks have not shown this issue so far.


[gem5-users] gem5 crash with some cpu related error in real simulation after restore from a checkpoint

2018-05-04 Thread Da Zhang
Hey guys,

My gem5 real simulation crashed with
"
gem5.opt: build_alter/build/X86/cpu/timebuf.hh:54: void
TimeBuffer::valid(int) const [with T = DefaultIEWDefaultCommit]:
Assertion `idx >= -past && idx <= future' failed.
"
I am using DerivO3CPU with a 4MB L2 cache, 32GB of memory, and Ramulator as
the memory model. I restored a checkpoint which was taken using the KVM CPU
(I created multiple event queues to make KVM run correctly). The workload
was running 8 instances of the same application with 2 threads each.
More of the backtrace:
"
--- BEGIN LIBC BACKTRACE ---
build_alter/build/X86/gem5.opt(_Z15print_backtracev+0x15)[0x1558a15]
build_alter/build/X86/gem5.opt(_Z12abortHandleri+0x39)[0x156b039]
/lib64/libpthread.so.0(+0xf100)[0x7f538bf88100]
/lib64/libc.so.6(gsignal+0x37)[0x7f538a5405f7]
/lib64/libc.so.6(abort+0x148)[0x7f538a541ce8]
/lib64/libc.so.6(+0x2e566)[0x7f538a539566]
/lib64/libc.so.6(+0x2e612)[0x7f538a539612]
build_alter/build/X86/gem5.opt[0x906252]
build_alter/build/X86/gem5.opt(_ZN10DefaultIEWI9O3CPUImplE12instToCommitER14RefCountingPtrI13BaseO3DynInstIS0_EE+0x1dd)[0x150c09d]
build_alter/build/X86/gem5.opt(_ZN10DefaultIEWI9O3CPUImplE12executeInstsEv+0x560)[0x150ef40]
build_alter/build/X86/gem5.opt(_ZN10DefaultIEWI9O3CPUImplE4tickEv+0x11e)[0x1513a9e]
build_alter/build/X86/gem5.opt(_ZN9FullO3CPUI9O3CPUImplE4tickEv+0x12b)[0x14e5a8b]
build_alter/build/X86/gem5.opt(_ZN10EventQueue10serviceOneEv+0xc5)[0x155f265]
build_alter/build/X86/gem5.opt(_Z9doSimLoopP10EventQueue+0x40)[0x1576580]
build_alter/build/X86/gem5.opt(_Z8simulatem+0xd4d)[0x157768d]
build_alter/build/X86/gem5.opt[0xe9a07a]
build_alter/build/X86/gem5.opt[0xb84bc7]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x50c2)[0x7f538b88d5d2]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f538b88e1c2]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4f00)[0x7f538b88d410]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f538b88e1c2]
/lib64/libpython2.7.so.1.0(+0xfb5ff)[0x7f538b8a75ff]
--- END LIBC BACKTRACE ---
"
Any idea? This only happens during real simulation, and sometimes we can
avoid it by running a longer warmup in atomic mode.

best,
Da Zhang
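
To make the failed condition concrete, here is a toy model of a time buffer whose valid indices run from -past to future, which is the check quoted in the assertion message. It is a simplified illustration, not gem5's implementation: the class, the helper, and the "scan forward for a free slot" scenario are all hypothetical, and the thread does not establish that this is what happened in the crash above.

```python
class TimeBufferModel:
    """Toy stand-in for a pipeline time buffer with a fixed window of slots."""

    def __init__(self, past, future):
        self.past = past
        self.future = future
        # One list of queued entries per relative cycle in [-past, future].
        self.slots = {idx: [] for idx in range(-past, future + 1)}

    def valid(self, idx):
        # Same condition as in the reported assertion message.
        return -self.past <= idx <= self.future

    def __getitem__(self, idx):
        if not self.valid(idx):
            raise AssertionError(
                f"idx {idx} outside [-{self.past}, {self.future}]")
        return self.slots[idx]


def first_free_cycle(buf, width):
    """Return the first relative cycle holding fewer than `width` entries."""
    cycle = 0
    while len(buf[cycle]) >= width:
        cycle += 1  # steps past buf.future if every future slot is full
    return cycle


if __name__ == "__main__":
    buf = TimeBufferModel(past=1, future=2)
    for cycle in range(0, 3):
        buf[cycle].append("inst")  # fill cycles 0, 1 and 2
    try:
        first_free_cycle(buf, width=1)
    except AssertionError as err:
        print("assertion:", err)
```

In this model the assertion fires only when a caller asks for a cycle further ahead than the buffer was sized for, which is one generic way such a check can fail.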

[gem5-users] How to run a checkpoint on an old gem5

2018-05-02 Thread Da Zhang
Hi guys,

I have some new gem5 checkpoints made with KVM. However, I want to run them
with an old version of gem5 (from 2014). I know there is an updater, but how
can I downgrade a checkpoint to run it on the old gem5?

Best,
Da Zhang

[gem5-users] gem5 crash with some cpu related error in real simulation after restore from a checkpoint

2018-04-04 Thread Da Zhang
Hey guys,

My gem5 real simulation crashed with
"
gem5.opt: build_alter/build/X86/cpu/timebuf.hh:54: void
TimeBuffer::valid(int) const [with T = DefaultIEWDefaultCommit]:
Assertion `idx >= -past && idx <= future' failed.
"
I am using DerivO3CPU with a 32MB L2 cache, 32GB of memory, and Ramulator as
the memory model. I restored a checkpoint which was taken using the KVM CPU.
The workload was running 8 instances of the same application with 2 threads
each.
More of the backtrace:
"
--- BEGIN LIBC BACKTRACE ---
build_alter/build/X86/gem5.opt(_Z15print_backtracev+0x15)[0x1558a15]
build_alter/build/X86/gem5.opt(_Z12abortHandleri+0x39)[0x156b039]
/lib64/libpthread.so.0(+0xf100)[0x7f538bf88100]
/lib64/libc.so.6(gsignal+0x37)[0x7f538a5405f7]
/lib64/libc.so.6(abort+0x148)[0x7f538a541ce8]
/lib64/libc.so.6(+0x2e566)[0x7f538a539566]
/lib64/libc.so.6(+0x2e612)[0x7f538a539612]
build_alter/build/X86/gem5.opt[0x906252]
build_alter/build/X86/gem5.opt(_ZN10DefaultIEWI9O3CPUImplE12instToCommitER14RefCountingPtrI13BaseO3DynInstIS0_EE+0x1dd)[0x150c09d]
build_alter/build/X86/gem5.opt(_ZN10DefaultIEWI9O3CPUImplE12executeInstsEv+0x560)[0x150ef40]
build_alter/build/X86/gem5.opt(_ZN10DefaultIEWI9O3CPUImplE4tickEv+0x11e)[0x1513a9e]
build_alter/build/X86/gem5.opt(_ZN9FullO3CPUI9O3CPUImplE4tickEv+0x12b)[0x14e5a8b]
build_alter/build/X86/gem5.opt(_ZN10EventQueue10serviceOneEv+0xc5)[0x155f265]
build_alter/build/X86/gem5.opt(_Z9doSimLoopP10EventQueue+0x40)[0x1576580]
build_alter/build/X86/gem5.opt(_Z8simulatem+0xd4d)[0x157768d]
build_alter/build/X86/gem5.opt[0xe9a07a]
build_alter/build/X86/gem5.opt[0xb84bc7]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x50c2)[0x7f538b88d5d2]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f538b88e1c2]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x4f00)[0x7f538b88d410]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x425f)[0x7f538b88c76f]
/lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f538b88e0bd]
/lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f538b88e1c2]
/lib64/libpython2.7.so.1.0(+0xfb5ff)[0x7f538b8a75ff]
--- END LIBC BACKTRACE ---
"
Any idea? This only happens during real simulation, and sometimes we can
avoid it by running a longer warmup in atomic mode.

best,
Da Zhang

Re: [gem5-users] running fs.py with X86KvmCPU failed

2018-02-15 Thread Da Zhang
Hi Jason,

Thanks a lot! Your scripts work!
With your scripts I am able to scale up to 40 cores on a compute node with
20 physical cores / 40 logical cores. So I think fs.py doesn't set up
multithreaded mode correctly for running with the KVM CPU. Some of your code
is commented as "required" for running KVM; is that all I need to make
fs.py work?
Moreover, can I use --fast-forward combined with KVM? Our goal is to run a
program with KVM to an interesting point and then switch to a detailed CPU
for experiments. Can we specify the number of instructions to fast-forward
(as with the --fast-forward option), so that we can avoid the long wait
for things like initialization?

best,
Da Zhang



On Thu, Feb 15, 2018 at 12:17 PM, Jason Lowe-Power <ja...@lowepower.com>
wrote:

> Hi Da,
>
> You likely need to enable gem5's multithreaded mode to get many CPUs to
> boot correctly. I've had success with up to 32 cores on a 4-core 8-thread
> system. I'm not sure if fs.py automatically does this correctly or not. See
> my scripts here:
> https://github.com/jlpresearch/gem5/tree/jason/kvm-testing/configs/myconfigs
> (note: I haven't rebased in a couple of months).
>
> Cheers,
> Jason
>
> On Tue, Feb 13, 2018 at 7:52 AM Da Zhang <d...@vt.edu> wrote:
>
>> Hey Jason,
>>
>> The package works. However, I encountered performance issues with increased 
>> CPU
>> number. The performance was very great with up to 4 CPUs ("-n", I am
>> assuming "number of CPUs" equal to "number of cores" for testing
>> multithreading workload later). However, the system fails to boot (or maybe
>> it was just too slow) when I scale it >= 5 CPUs. Any ideas or suggestions?
>>
>> I am working on a node with 2 x Intel(R) Xeon(R) CPU E5-2470 v2 @
>> 2.40GHz, 10 physic cores each (and 40 logical cores in total). I only used
>> the fs.py script with --cpu-type=X86KvmCPU, --mem-size=8GB and -n 4.
>> Moreover, the ancient Linux kernel and image from gem5 website have no
>> performance problem with increased CPUs.
>>
>> best,
>> Da Zhang
>>
>> On Thu, Feb 8, 2018 at 3:50 PM, Da Zhang <d...@vt.edu> wrote:
>>
>>> Hi Jason
>>>
>>> The package works (I used the second one)! And it also works with the
>>> package you provided (https://gem5-review.googlesource.com/c/public/
>>> gem5/+/7301) in my another email to fix the keyboard and mouse issue
>>> for running later linux kernel and ubuntu. (However, there are some
>>> conflicts in this package, and it is a little tricky to merge them.)
>>>
>>> Now, I can run gem5 for linux kernel v4.8.13 and ubuntu 16.04.1 with
>>> kvm support. And the speedup is so amazing. It used to take me 20 ~ 30
>>> minutes to boot up the system without the kvm cpu. Now, it takes only
>>> several seconds!!!
>>>
>>> Thanks so much!
>>>
>>> best,
>>> Da
>>>
>>> On Thu, Feb 8, 2018 at 12:01 PM, Jason Lowe-Power <ja...@lowepower.com>
>>> wrote:
>>>
>>>> These patches "fix" the problem. However, they may not apply cleanly to
>>>> HEAD and they definitely are not cleanly implemented.
>>>>
>>>> https://gem5-review.googlesource.com/c/public/gem5/+/7362
>>>> https://gem5-review.googlesource.com/c/public/gem5/+/7361
>>>>
>>>> Cheers,
>>>> Jason
>>>>
>>>> On Wed, Feb 7, 2018 at 8:49 PM Da Zhang <d...@vt.edu> wrote:
>>>>
>>>>> I am trying to run fs.py with kvm support which might help speedup
>>>>> our simulation in full system mode. I find the cpu type X86KvmCPU which is
>>>>> a "kvm-based hardware virtualized cpu". But running fs.py failed with the
>>>>> error information:
>>>>>
>>>>> panic: KVM: Failed to enter virtualized mode (hw reason: 0x8021)
>>>>>
>>>>> Memory Usage: 2416600 KBytes
>>>>>
>>>>> Program aborted at tick 53418967500
>>>>>
>>>>> --- BEGIN LIBC BACKTRACE ---
>>>>>
>>>>> build/X86/gem5.fast(_Z15print_backtracev+0x1f)[0xaee60f]
>>>>>
>>>>> build/X86/gem5.fast(_Z12abortHandleri+0x34)[0xaee6f4]
>>>>>
>>>>> /lib64/libpthread.so.0(+0xf5e0)[0x7f5b9ac685e0]
>>>>>
>>>>> /lib64/libc.so.6(gsignal+0x37)[0x7f5b9901c1f7]
>>>>>
>>>>> /lib64/libc.so.6(abort+0x148)[0x7f5b9901d8e8]
>>>>>
>>>>> build/X86/gem5.fast[0x6627df

Re: [gem5-users] running fs.py with X86KvmCPU failed

2018-02-13 Thread Da Zhang
Hey Jason,

The patch works.
However, I encountered performance issues as the number of CPUs increases.
The performance was very good with up to 4 CPUs ("-n"; I am assuming "number
of CPUs" equals "number of cores" for testing multithreaded workloads
later). However, the system fails to boot (or maybe it is just too slow)
when I scale to >= 5 CPUs. Any ideas or suggestions?

I am working on a node with 2 x Intel(R) Xeon(R) CPU E5-2470 v2 @ 2.40GHz,
10 physical cores each (40 logical cores in total). I only used the fs.py
script with --cpu-type=X86KvmCPU, --mem-size=8GB and -n 4. Moreover, the
older Linux kernel and disk image from the gem5 website show no performance
problem with increased CPU counts.

best,
Da Zhang

On Thu, Feb 8, 2018 at 3:50 PM, Da Zhang <d...@vt.edu> wrote:

> Hi Jason
>
> The package works (I used the second one)! And it also works with the
> package you provided (https://gem5-review.googlesource.com/c/public/
> gem5/+/7301) in my another email to fix the keyboard and mouse issue for
> running later linux kernel and ubuntu. (However, there are some conflicts
> in this package, and it is a little tricky to merge them.)
>
> Now, I can run gem5 for linux kernel v4.8.13 and ubuntu 16.04.1 with
> kvm support. And the speedup is so amazing. It used to take me 20 ~ 30
> minutes to boot up the system without the kvm cpu. Now, it takes only
> several seconds!!!
>
> Thanks so much!
>
> best,
> Da
>
> On Thu, Feb 8, 2018 at 12:01 PM, Jason Lowe-Power <ja...@lowepower.com>
> wrote:
>
>> These patches "fix" the problem. However, they may not apply cleanly to
>> HEAD and they definitely are not cleanly implemented.
>>
>> https://gem5-review.googlesource.com/c/public/gem5/+/7362
>> https://gem5-review.googlesource.com/c/public/gem5/+/7361
>>
>> Cheers,
>> Jason
>>
>> On Wed, Feb 7, 2018 at 8:49 PM Da Zhang <d...@vt.edu> wrote:
>>
>>> I am trying to run fs.py with kvm support which might help speedup
>>> our simulation in full system mode. I find the cpu type X86KvmCPU which is
>>> a "kvm-based hardware virtualized cpu". But running fs.py failed with the
>>> error information:
>>>
>>> panic: KVM: Failed to enter virtualized mode (hw reason: 0x8021)
>>>
>>> Memory Usage: 2416600 KBytes
>>>
>>> Program aborted at tick 53418967500
>>>
>>> --- BEGIN LIBC BACKTRACE ---
>>>
>>> build/X86/gem5.fast(_Z15print_backtracev+0x1f)[0xaee60f]
>>>
>>> build/X86/gem5.fast(_Z12abortHandleri+0x34)[0xaee6f4]
>>>
>>> /lib64/libpthread.so.0(+0xf5e0)[0x7f5b9ac685e0]
>>>
>>> /lib64/libc.so.6(gsignal+0x37)[0x7f5b9901c1f7]
>>>
>>> /lib64/libc.so.6(abort+0x148)[0x7f5b9901d8e8]
>>>
>>> build/X86/gem5.fast[0x6627df]
>>>
>>> build/X86/gem5.fast[0x95d518]
>>>
>>> build/X86/gem5.fast(_ZN10BaseKvmCPU13handleKvmExitEv+0x249)[0xc10859]
>>>
>>> build/X86/gem5.fast[0xbade3c]
>>>
>>> build/X86/gem5.fast(_ZN10EventQueue10serviceOneEv+0x91)[0xad4fc1]
>>>
>>> build/X86/gem5.fast(_Z9doSimLoopP10EventQueue+0xa0)[0xb5b110]
>>>
>>> build/X86/gem5.fast(_Z8simulatem+0x1f3)[0xb5b563]
>>>
>>> build/X86/gem5.fast[0x93867d]
>>>
>>> build/X86/gem5.fast[0x939c65]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x730a)[0x7f5b9a56b0ca]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x663c)[0x7f5b9a56a3fc]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f5b9a56a57d]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f5b9a56a57d]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f5b9a56d002]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5513)[0x7f5b9a5692d3]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x663c)[0x7f5b9a56a3fc]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>>>
>>> /lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f5b9a56d002]
>>>
>>> /lib64/libpython2.7.so.1.0(+0x10043f)[0x7f5b9a58643f]
>>>
>>> /lib64/libpython2.7.so.1.0(PyRun_StringFlags+0x65)[0x7f5b9a5872a5]
>>>
>>

[gem5-users] gem5 with X86KvmCPU performance decreases dramatically with increased cpu number

2018-02-11 Thread Da Zhang
Hey guys,

I am using X86KvmCPU to speed up my simulation of Linux kernel v4.8.13 and
Ubuntu Server 16.04.1. The performance was very good with up to 4 CPUs
("-n"; I am assuming "number of CPUs" equals "number of cores" for
testing multithreaded workloads later). However, the system fails to boot
(or maybe it is just too slow) when I scale to >= 5 CPUs. Any ideas or
suggestions?

I am working on a node with 2 x Intel(R) Xeon(R) CPU E5-2470 v2 @ 2.40GHz,
10 physical cores each (40 logical cores in total). I only used the fs.py
script with --cpu-type=X86KvmCPU, --mem-size=8GB and -n 4.

best,
Da Zhang
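
Before digging into gem5 itself, two host-side conditions can be ruled out quickly: whether the KVM device is usable by the current user, and whether the requested number of simulated CPUs exceeds the host's logical CPUs. The sketch below is only such a sanity check; the function name is hypothetical and "/dev/kvm" is the conventional Linux device path, assumed here. On the node described above (40 logical cores) a request for 5 CPUs would pass, so this would not by itself explain the slowdown.

```python
import os


def kvm_host_warnings(sim_cpus, host_cpus=None, kvm_path="/dev/kvm"):
    """Return human-readable warnings about the host setup (empty if none)."""
    if host_cpus is None:
        host_cpus = os.cpu_count() or 1
    warnings = []
    # KVM needs read/write access to the device node.
    if not os.access(kvm_path, os.R_OK | os.W_OK):
        warnings.append(
            f"{kvm_path} is not readable and writable by this user")
    # More simulated CPUs than host logical CPUs means oversubscription.
    if sim_cpus > host_cpus:
        warnings.append(
            f"{sim_cpus} simulated CPUs exceed {host_cpus} host logical CPUs")
    return warnings


if __name__ == "__main__":
    for message in kvm_host_warnings(sim_cpus=5):
        print("warning:", message)
```

The host_cpus and kvm_path parameters exist only so the check can be exercised without a KVM-capable machine.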

Re: [gem5-users] running fs.py with X86KvmCPU failed

2018-02-08 Thread Da Zhang
Hi Jason

The patches work (I used the second one)! They also work together with the
patch you provided (
https://gem5-review.googlesource.com/c/public/gem5/+/7301) in my other
email thread to fix the keyboard and mouse issue when running a newer Linux
kernel and Ubuntu. (However, there are some conflicts with this patch, and
it is a little tricky to merge them.)

Now I can run gem5 with Linux kernel v4.8.13 and Ubuntu 16.04.1 with
KVM support, and the speedup is amazing. It used to take me 20 to 30
minutes to boot the system without the KVM CPU. Now it takes only
several seconds!

Thanks so much!

best,
Da

On Thu, Feb 8, 2018 at 12:01 PM, Jason Lowe-Power <ja...@lowepower.com>
wrote:

> These patches "fix" the problem. However, they may not apply cleanly to
> HEAD and they definitely are not cleanly implemented.
>
> https://gem5-review.googlesource.com/c/public/gem5/+/7362
> https://gem5-review.googlesource.com/c/public/gem5/+/7361
>
> Cheers,
> Jason
>
> On Wed, Feb 7, 2018 at 8:49 PM Da Zhang <d...@vt.edu> wrote:
>
>> I am trying to run fs.py with kvm support which might help speedup
>> our simulation in full system mode. I find the cpu type X86KvmCPU which is
>> a "kvm-based hardware virtualized cpu". But running fs.py failed with the
>> error information:
>>
>> panic: KVM: Failed to enter virtualized mode (hw reason: 0x8021)
>> Memory Usage: 2416600 KBytes
>> Program aborted at tick 53418967500
>> --- BEGIN LIBC BACKTRACE ---
>> build/X86/gem5.fast(_Z15print_backtracev+0x1f)[0xaee60f]
>> build/X86/gem5.fast(_Z12abortHandleri+0x34)[0xaee6f4]
>> /lib64/libpthread.so.0(+0xf5e0)[0x7f5b9ac685e0]
>> /lib64/libc.so.6(gsignal+0x37)[0x7f5b9901c1f7]
>> /lib64/libc.so.6(abort+0x148)[0x7f5b9901d8e8]
>> build/X86/gem5.fast[0x6627df]
>> build/X86/gem5.fast[0x95d518]
>> build/X86/gem5.fast(_ZN10BaseKvmCPU13handleKvmExitEv+0x249)[0xc10859]
>> build/X86/gem5.fast[0xbade3c]
>> build/X86/gem5.fast(_ZN10EventQueue10serviceOneEv+0x91)[0xad4fc1]
>> build/X86/gem5.fast(_Z9doSimLoopP10EventQueue+0xa0)[0xb5b110]
>> build/X86/gem5.fast(_Z8simulatem+0x1f3)[0xb5b563]
>> build/X86/gem5.fast[0x93867d]
>> build/X86/gem5.fast[0x939c65]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x730a)[0x7f5b9a56b0ca]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x663c)[0x7f5b9a56a3fc]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f5b9a56a57d]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x67bd)[0x7f5b9a56a57d]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f5b9a56d002]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x5513)[0x7f5b9a5692d3]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalFrameEx+0x663c)[0x7f5b9a56a3fc]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalCodeEx+0x7ed)[0x7f5b9a56cefd]
>> /lib64/libpython2.7.so.1.0(PyEval_EvalCode+0x32)[0x7f5b9a56d002]
>> /lib64/libpython2.7.so.1.0(+0x10043f)[0x7f5b9a58643f]
>> /lib64/libpython2.7.so.1.0(PyRun_StringFlags+0x65)[0x7f5b9a5872a5]
>> build/X86/gem5.fast(_Z6m5MainiPPc+0x5f)[0xad327f]
>> build/X86/gem5.fast(main+0x33)[0x5ddf23]
>> /lib64/libc.so.6(__libc_start_main+0xf5)[0x7f5b99008c05]
>> build/X86/gem5.fast[0x5df8fc]
>> --- END LIBC BACKTRACE ---
>> Aborted (core dumped)
>>
>> The command is as simple as:
>>
>> build/X86/gem5.fast -d ~/tmp/output1/ configs/example/fs.py
>> --mem-size=2GB --disk-image=linux-x86.img --cpu-type=X86KvmCPU
>>
>> Any ideas? Thanks in advance.
>>
>> best,
>> Da
>>

Re: [gem5-users] fail to run fs mode with linux kernel v4.8.13 and ubuntu image 16.04.1

2018-02-08 Thread Da Zhang
Hi Jason,

The patch you provided works! I was able to run Linux kernel v4.8.13 with
Ubuntu 16.04.1 on the latest gem5 with the patch applied. Sorry, I am new
to using these patches and didn't apply it correctly at first.
Thanks a lot.

best,
Da


Re: [gem5-users] fail to run fs mode with linux kernel v4.8.13 and ubuntu image 16.04.1

2018-02-02 Thread Da Zhang
I am able to run Linux kernel v4.8.13 with the Ubuntu 16.04.1 (server) image
by using an old git branch, f881618. The patch generates errors when building
the simulator.


Re: [gem5-users] fail to run fs mode with linux kernel v4.8.13 and ubuntu image 16.04.1

2018-02-01 Thread Da Zhang
Hi Jason,

I was able to get past the problem by using a very old stable gem5 release
(stable_2015_09_03). However, the system stops at

Welcome to Ubuntu 16.04.1 LTS!

systemd[1]: Set hostname to .

There is no panic. The boot process freezes at this point, whereas I was
expecting an interactive terminal via m5term, as when running the default
ancient kernel.

Moreover, it seems that in the latest update some of the scripts weren't
updated accordingly. I did a pull and update last night, and running fs.py
now gives me the following error:

  File "<string>", line 1, in <module>
  File "/work/dragonstooth/daz3/gem5/src/python/m5/main.py", line 433, in main
    exec filecode in scope
  File "configs/example/fs.py", line 395, in <module>
    elif buildEnv['TARGET_ISA'] != "arm" and options.generate_dtb:
AttributeError: Values instance has no attribute 'generate_dtb'

I will try the patch today. Thanks in advance.
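
The AttributeError above is what optparse produces when a script reads an
option that no add_option() call registered. That suggests fs.py and the
options module it imports were out of sync after the pull, though this is an
assumption and was not checked against the gem5 tree. Below is a minimal
stand-alone Python sketch of the failure and a defensive getattr() workaround.
The parser is a toy one, not gem5's, and get_option is a hypothetical helper.

```python
from optparse import OptionParser


def get_option(options, name, default=None):
    """Return options.<name>, or default if the parser never defined it."""
    return getattr(options, name, default)


# Toy parser that registers --mem-size but, like the stale script setup,
# never registers --generate-dtb.
parser = OptionParser()
parser.add_option("--mem-size", dest="mem_size", default="2GB")
options, _ = parser.parse_args([])

try:
    options.generate_dtb  # direct access fails: the attribute was never set
    direct_access_failed = False
except AttributeError:
    direct_access_failed = True

# The guarded lookup falls back to a default instead of raising.
generate_dtb = get_option(options, "generate_dtb", False)

if __name__ == "__main__":
    print("direct access raised AttributeError:", direct_access_failed)
    print("guarded lookup:", generate_dtb)
```

A guard like this only hides the mismatch; the cleaner fix is to update the
whole configs directory so the option is registered where fs.py expects it.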

best,
Da

On Thu, Feb 1, 2018 at 11:03 AM, Jason Lowe-Power <ja...@lowepower.com>
wrote:

> Hi Da,
>
> I'm aware of this problem and have submitted a patch. Unfortunately, I
> haven't had time to fix the problem in the correct way (as detailed by Gabe
> in the review). This patch will get linux to boot. If you have the
> inclination, feel free to fix it the correct way!
>
> https://gem5-review.googlesource.com/c/public/gem5/+/7301
>
> Cheers,
> Jason
>
> On Wed, Jan 31, 2018 at 11:46 AM Da Zhang <d...@vt.edu> wrote:
>
>> Hi all,
>>
>> I have a problem running configs/example/fs.py with Linux v4.8.13
>> and the Ubuntu 16.04.1 image. I followed the steps on Jason's blog (
>> http://www.lowepower.com/jason/setting-up-gem5-full-system.html) to
>> build the Linux kernel binary and the Ubuntu image. Booting aborts after the
>> line "i8042: PNP: No PS/2 controller found. Probing ports directly.".
>> A comment at the bottom of Jason's blog mentions the same issue,
>> but it seems that it hasn't been fixed yet.
>>
>> gem5 gives the following messages:
>>
>> info: Entering event queue @ 0.  Starting simulation...
>> warn: instruction 'fninit' unimplemented
>> warn: Don't know what interrupt to clear for console.
>> warn: Tried to clear PCI interrupt 14
>> warn: i8042 "Write output port" command not implemented.
>> panic: Data written for unrecognized command 0xd1
>> Memory Usage: 4664884 KBytes
>> Program aborted at tick 1922165374500
>> Aborted
>>
>> I have also attached the output config.ini and system.pc.com_1.device to
>> this email. Any suggestions are appreciated.
>>
>> best,
>> Da