[gem5-users] Modifying Branch Predictor
Hi all, I want to experiment with different branch predictors on the Minor CPU for ARM, and I would also like to disable the branch predictor entirely if possible. gem5 currently ships sophisticated branch predictors, so I would also like to experiment with simpler strategies. How should I go about modifying the branch predictors and disabling them? Thanks a lot in advance! ___ gem5-users mailing list gem5-users@gem5.org http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
Re: [gem5-users] m5ops with RISCV
Dear Alec

Thanks a lot for your prompt reply. I looked at the CSR counters, and as far as I can tell I cannot get the number of integer, load, and store instructions executed separately. Ideally, if I could reset and dump stats using m5_dumpstats and m5_resetstats, I could get detailed information on the instructions executed per type in one go. But this is not available at the moment. Is there *any tutorial* that can be used for *implementing* this?

Thanks a lot once again.
V Vanchinathan

On Wed, Dec 20, 2017 at 2:44 AM, Alec Roelke <ar...@virginia.edu> wrote:
> Hi Vanchinathan,
>
> At the moment, there is not a patch for m5op support for RISC-V. If you want binaries to have access to performance counts, the best way to do that would probably be to add CSRs for them and have those CSRs return the values of the corresponding stats when read, like INSTRET, CYCLE, and TIME do.
>
> -Alec Roelke
>
> On Tue, Dec 19, 2017 at 1:25 AM, Vanchinathan Venkataramani <dcsv...@gmail.com> wrote:
>> Dear all
>>
>> I would like to collect performance counters for a RISCV binary in gem5.
>>
>> Is there a util/m5 patch for generating m5ops for RISCV? Any help really appreciated.
>>
>> Best regards
>> V Vanchinathan
[gem5-users] m5ops with RISCV
Dear all, I would like to collect performance counters for a RISCV binary in gem5. Is there a util/m5 patch for generating m5ops for RISCV? Any help is really appreciated. Best regards, V Vanchinathan
[gem5-users] Running on RISCV SE mode
Dear Forum members

I am trying to run a hello world application compiled for RISCV in the SE mode of gem5. I used https://github.com/riscv/riscv-gnu-toolchain to obtain the cross compiler for RISCV and compiled the hello world program with:

    /opt/riscv/bin/riscv64-unknown-elf-gcc -static -O2 hello.c -o hello

When I run "file hello", I get the following output:

    hello: ELF 64-bit LSB executable, version 1 (SYSV), statically linked, not stripped

When I try to run this executable on gem5 using the following command:

    ./build/RISCV_SE/gem5.opt ./configs/example/se.py -c /home/vanchi/srjkvr_riscv/benchmarks/hello --cpu-type=atomic -n 1

I get the error:

    fatal: Object file architecture does not match compiled ISA (RISCV).

It would be really helpful if you could point me to the right version of the cross compiler, or to the gem5 command line that lets me run an application compiled for RISCV on gem5.

Thanks a lot in advance!
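One way to narrow down this kind of mismatch is to read the machine field of the ELF header directly, since the "file" output above does not name an architecture. The following is only a minimal sketch: the helper names elf_machine and is_riscv are hypothetical and not part of gem5 or the toolchain. It relies only on the standard ELF layout, in which e_machine is the 16-bit field at byte offset 18, and on the machine number 243 (EM_RISCV) that the ELF specification assigns to RISC-V.

```python
import struct

EM_RISCV = 243  # e_machine value assigned to RISC-V in the ELF specification


def elf_machine(header: bytes) -> int:
    """Return the e_machine field from the first 20 bytes of an ELF file."""
    if len(header) < 20 or header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # EI_DATA (byte 5): 1 = little-endian, 2 = big-endian
    endian = "<" if header[5] == 1 else ">"
    # e_type sits at offset 16 and e_machine at offset 18; both are 16 bits
    (machine,) = struct.unpack_from(endian + "H", header, 18)
    return machine


def is_riscv(path: str) -> bool:
    """Check whether the ELF file at `path` is tagged as a RISC-V binary."""
    with open(path, "rb") as f:
        return elf_machine(f.read(20)) == EM_RISCV
```

If is_riscv("hello") returns True, the compiler output is tagged as RISC-V and the mismatch is more likely on the simulator side; if it returns False, the compiler invocation is the first thing to check. Either way this is a diagnostic aid, not a fix.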
[gem5-users] Implementation of stack in SE mode
Dear all

I am trying to create a one-to-one mapping of the stack variables in a function to a set of physical addresses in ARM, gem5, SE mode. To achieve this, I need to know the size and starting virtual address of the stack in a function.

From my understanding, the following instruction at the start of the function holds the stack size:

    sub sp, sp, #size

Q1. If this is correct, I would like to know how to intercept this instruction and obtain the virtual address and size of the stack that needs to be allocated.

Additionally, in some cases there are seemingly random padding instructions below this instruction, such as:

    sub sp, sp, #4

Q2. What is the rationale behind these padding instructions?

Any help is really appreciated. Thanks a lot in advance.

Best regards
V Vanchinathan
[gem5-users] Virtual to Physical Address in ARMv8 FS Classic Memory
Hi all, I am currently running an application on a 64-core ARMv8 FS system with Classic Memory, with individual L1 D- and I-caches and a unified L2 cache. Looking at the cache memory trace, two virtual addresses, one from kernel space (e.g. 0xffc071a63400) and one from application space (e.g. 0x915400), are mapped to the same physical address (e.g. 0xf1a63400). The *kernel memory access* occurs first and ends as a *cache miss*. However, the first access to the application memory address ends up as a *cache hit*. I double-checked the cache trace and statistics to confirm this. One explanation is that these belong to two different threads and hence can have the same physical address due to context switching. However, if that were the case, the access to the application address should end up as a miss (which is not what happens). Any explanation is greatly appreciated. Thanks a lot in advance.
[gem5-users] Safety of SnoopMask Extension
Hello all, I am trying to run 64 cores, ARMv8, FS with Classic Memory on gem5. I found that the current SnoopMask can only support 64 connections and hence cannot be used to model more than 32 cores with separate L1 I- and D-caches. This forced me to change the data type from uint64_t to uint128_t and to modify some ostream-related functions. The change works fine for the benchmark I tested. However, I would like to know if this is safe. Thanks a lot in advance!
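The limit described above follows from simple counting: if every snooping port needs its own bit in the mask, a 64-bit mask runs out once more than 64 ports are connected. The sketch below only illustrates that arithmetic. The function names are hypothetical, and the assumption of exactly two snooping ports per core (one L1 I-cache and one L1 D-cache) is taken from the message above; it does not model gem5's actual SnoopMask code.

```python
def ports_needed(num_cores: int, caches_per_core: int = 2) -> int:
    """Number of snooping ports if each core has `caches_per_core` L1 caches."""
    return num_cores * caches_per_core


def fits_in_mask(num_cores: int, mask_bits: int = 64,
                 caches_per_core: int = 2) -> bool:
    """True if one mask bit per port is enough for this core count."""
    return ports_needed(num_cores, caches_per_core) <= mask_bits


def port_mask(port_id: int, mask_bits: int = 64) -> int:
    """One-hot mask for a port; rejects ports that do not fit in the mask."""
    if not 0 <= port_id < mask_bits:
        raise ValueError("port %d does not fit in a %d-bit mask"
                         % (port_id, mask_bits))
    return 1 << port_id
```

Under these assumptions fits_in_mask(32) holds while fits_in_mask(64) does not, and widening the mask to 128 bits makes the 64-core case fit again.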
[gem5-users] ARM FS Timing/Minor on latest gem5
Hi all

I am trying to use the latest version of gem5 on ARMv8 FS with Linux kernel 4.3. I can successfully create a checkpoint and restore it with --num-cpu=2 in atomic mode. However, when I restore with the Minor/Timing CPU, the simulation keeps running and never exits. If I understand the instruction execution trace correctly, something is going wrong in the lock-related instructions.

I would like to know if anyone has encountered the same problem, and what the solution was if so. My execution command is given in [1].

Thanks a lot in advance!

[1]
    ./build/ARM/gem5.opt -d m5out/hello/ configs/example/fs.py --kernel=/home/abc/linux-arm-gem5/vmlinux --machine-type=VExpress_GEM5_V1 --dtb-file=/home/abc/gem5/system/arm/dt/armv8_gem5_v1_2cpu.dtb --disk-image=aarch64-ubuntu-trusty-headless.img --num-cpu=2 --script=/home/abc/gem5/hello.rcS --mem-size=2GB --mem-type=SimpleMemory -r 1 --restore-with-cpu=timing
Re: [gem5-users] Enable Printing in Linux 4.3
Hi Jason

I cannot see the output in the system.terminal file. I am using the latest gem5 version.

Best regards
V Vanchinathan

On Thu, Jun 30, 2016 at 10:03 PM, Jason Lowe-Power <ja...@lowepower.com> wrote:
> Hello,
>
> These are not printed to stdout in full system mode, but are saved in the terminal file (something like system.pc.com_1.terminal). You can also see the terminal output if you telnet into the guest system (e.g., using m5term in gem5/util/term).
>
> Cheers,
> Jason
>
> On Thu, Jun 30, 2016 at 4:48 AM Vanchinathan Venkataramani <dcsv...@gmail.com> wrote:
>> Hi all
>>
>> I am currently running Linux 4.3 for ARM64. Any commands in rcS using echo or printf in programs do not get printed.
>>
>> Is there some flag in the kernel or gem5 that I need to set for enabling these prints?
>>
>> Any help is really appreciated.
>>
>> Best regards
>> V Vanchinathan
[gem5-users] ALPHA FS with one level cache
Hi all! I am trying to run ALPHA full system mode with a single level of MESI-coherent L1 cache. However, the current MESI version of ALPHA full system requires two levels of cache. I want to know if it is possible to remove the L2 cache in ALPHA_MESI_Two_Level. Thanks
Re: [gem5-users] Running Multiple Workloads in fs mode
Hi

Did you get this working?

Thanks

On Fri, Aug 1, 2014 at 11:02 PM, Mohammadsadegh Sadeghi via gem5-users <gem5-users@gem5.org> wrote:
> Hi everyone,
>
> How can I run multiple applications in fs mode? In fact, I want to run more than one workload simultaneously on different CPUs. I think, based on my Linux image, that I should use commands such as "schedtool" or "taskset" in my_script.rcS before running the benchmark applications. But they don't work.
> Please help me with what I should do.
>
> Highly appreciate your help.
> Mohammad Sadegh Sadeghi
[gem5-users] Task pinning and migration in ALPHA FS
Hi all, I am trying to simulate task pinning and migration by modifying the rcS script which launches my benchmarks. I tried taskset -c, but that is not supported in the kernel. Previous replies to a similar question suggest export GOMP_CPU_AFFINITY="0 3" for setting the affinity. However, the statistics do not show a lot of variance. Hence, I would like to know if there is any other way to do this. Thanks a lot in advance.
[gem5-users] periodic stats for multi threaded benchmarks
Hi all

I'm trying to find the periodic CPI of a multi-threaded PARSEC benchmark for every 10 million instructions executed. I added a new event to intercept execution and obtain the statistics. This approach works for single-threaded benchmarks. However, in the case of PARSEC, the statistics dump does not happen periodically after every 10 million instructions.

Following is the code snippet I added in configs/common/Simulation.py:

    TEN_M = 1000
    for i in xrange(1, 450):
        for j in xrange(0, np):
            testsys.cpu[j].scheduleInstStop(0, i * TEN_M, "dump statistics")
            if options.cpu_type != 'atomic':
                testsys.switch_cpus[j].scheduleInstStop(0, i * TEN_M, "dump statistics")

Thanks in advance.
Re: [gem5-users] Maximum ARM cores in FS mode
I fixed this problem by using the system files provided at the beginning of this thread and recompiling the dtb file to support 8 cores, as described in: https://www.mail-archive.com/gem5-users@gem5.org/msg11338.html

Thanks

On Mon, May 25, 2015 at 6:52 PM, Vanchinathan Venkataramani dcsv...@nus.edu.sg wrote:
> Hi Tony
>
> Is it possible to run 8 cores on 32 bit ARM?
>
> Thanks
>
> On Sat, May 23, 2015 at 2:24 AM, Gutierrez, Anthony anthony.gutier...@amd.com wrote:
>> I thought I updated the system files to have support for 8 cores in aarch64. If you use aarch64 and the system files here: http://www.gem5.org/dist/current/arm/aarch-system-2014-10.tar.xz you may be able to use 8 cores. If not, you need to build an updated dtb file with support for 8 cores. I have one somewhere and I’ll put it up on the wiki, but I won’t be able to get to it today.
>>
>> -Tony
>>
>> *From:* gem5-users [mailto:gem5-users-boun...@gem5.org] *On Behalf Of* Vanchinathan Venkataramani
>> *Sent:* Friday, May 22, 2015 11:06 AM
>> *To:* gem5 users mailing list
>> *Subject:* [gem5-users] Maximum ARM cores in FS mode
>>
>> Hi all
>>
>> I found that the maximum number of cores in arm fs mode is four. What do I need to do to enable 8 cores?
>>
>> Thanks a lot in advance.
Re: [gem5-users] Maximum ARM cores in FS mode
Hi Tony, I tried the image you sent in the previous e-mail. However, the dtb file for 8 cores (binaries/vexpress.aarch32.ll_20131205.0-gem5.8cpu.dtb) is not present in it. Thanks
Re: [gem5-users] Maximum ARM cores in FS mode
Hi Tony, Is it possible to run 8 cores on 32 bit ARM? Thanks
Re: [gem5-users] Forwarding data from strd to ldrd
Hi Mitch

Thanks a lot for your reply. In my case there are two stores (ready but not committed) which together have all the data required by the ldrd. So I'm not sure how I can merge the data from both these stores and forward it to the load.

Thank you once again

On Thu, Mar 12, 2015 at 10:21 PM, Mitch Hayenga mitch.hayenga+g...@gmail.com wrote:
> Here's how o3 would work in this case. The relevant code is in src/cpu/o3/lsq_unit.hh (in the LSQUnit::read function) around line 640.
>
> The code in the backend explicitly works on micro-ops, so each load/store micro-op will get its own LSQ entry. If both the ldrd and strd are cracked, then nothing special needs to happen. Each load micro-op will iterate through the store queue looking to see if any store has the required data.
>
> If, however, we don't crack the ldrd but do crack the strd, then no store will have the full 8 bytes of data needed for the load to complete. Currently this will cause o3 to detect a partial forward from a store. The load is then not allowed to complete and will have to re-execute later once the store has committed. o3's LSQ does not assume it can merge multiple store entries together to satisfy a larger load.
>
> On Thu, Mar 12, 2015 at 5:05 AM, Andreas Hansson andreas.hans...@arm.com wrote:
>> I must confess I am not too familiar with how the various CPUs accomplish this. Hopefully someone else is able to help.
>>
>> Andreas
>>
>> From: Vanchinathan Venkataramani dcsv...@gmail.com
>> Reply-To: gem5 users mailing list gem5-users@gem5.org
>> Date: Wednesday, 11 March 2015 09:22
>> To: gem5 users mailing list gem5-users@gem5.org
>> Subject: [gem5-users] Forwarding data from strd to ldrd
>>
>> Hi Andreas
>>
>> I'm looking into the strd and ldrd instructions on gem5. ldrd reads eight bytes of data into two registers, while strd writes the value from two registers into memory. In gem5, strd is divided into multiple micro instructions, each writing four bytes of data. A younger ldrd might have to get the data directly from two micro store instructions.
>>
>> It will be really helpful if you can provide some pointers on how ldrd is able to get the data from the older strd micro instructions in gem5.
>>
>> Thanks in advance.
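The behaviour Mitch describes can be summarised as a three-way decision for each load: forward from a single older store that covers every byte, stall when an older store covers only part of the load, and otherwise read from the cache. The following is a minimal sketch of that decision only; StoreEntry and check_forwarding are hypothetical names, and the real logic lives in LSQUnit::read and also handles ordering, sizes, and state that this toy model ignores.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StoreEntry:
    addr: int  # first byte written by the store micro-op
    size: int  # number of bytes written


def check_forwarding(load_addr: int, load_size: int,
                     older_stores: List[StoreEntry]) -> str:
    """Decide how a load is serviced, given older stores in program order.

    Returns "forward" if one store covers the whole load, "stall" if the
    youngest overlapping store covers only part of it, and "cache" if no
    older store overlaps the load at all.
    """
    load_end = load_addr + load_size
    # Search from the youngest older store back to the oldest one.
    for store in reversed(older_stores):
        store_end = store.addr + store.size
        if store.addr <= load_addr and load_end <= store_end:
            return "forward"  # a single store has all the bytes
        if store.addr < load_end and load_addr < store_end:
            return "stall"    # partial overlap: entries are not merged
    return "cache"
```

With two 4-byte store micro-ops at 0x100 and 0x104 and an uncracked 8-byte load at 0x100, this model returns "stall", which matches the case in the thread: each store holds half of the data, and the entries are not merged.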
[gem5-users] Forwarding data from strd to ldrd
Hi Andreas, I'm looking into the strd and ldrd instructions on gem5. ldrd reads eight bytes of data into two registers, while strd writes the value from two registers into memory. In gem5, strd is divided into multiple micro instructions, each writing four bytes of data. A younger ldrd might have to get the data directly from two micro store instructions. It will be really helpful if you can provide some pointers on how ldrd is able to get the data from the older strd micro instructions in gem5. Thanks in advance.
[gem5-users] List of SPEC 2000 runnable benchmarks on gem5
Hi Andreas, I can find the list of SPEC2006 benchmarks that can successfully run on gem5 at the following link: http://www.m5sim.org/SPEC_CPU2006_benchmarks I would like to know if I can find similar details for the SPEC2000 benchmarks as well. Thanks
Re: [gem5-users] List of SPEC 2000 runnable benchmarks on gem5
I wanted to get information on whether a particular benchmark can run on gem5, the host execution seconds, number of instructions, etc.

On Tue, Mar 10, 2015 at 3:34 PM, Mahmood Naderan mahmood...@gmail.com wrote:
> The instructions are available at http://www.m5sim.org/SPEC2000_benchmarks
>
> On 3/10/15, Vanchinathan Venkataramani dcsv...@gmail.com wrote:
>> Hi Andreas
>>
>> I can find the list of SPEC2006 benchmarks that can successfully run on gem5 from the following link: http://www.m5sim.org/SPEC_CPU2006_benchmarks
>>
>> I would like to know if I can find similar details for SPEC2000 benchmarks also.
>>
>> Thanks
>
> --
> Regards,
> Mahmood
[gem5-users] Running LBM benchmark on ARM gem5 SE mode
I'm trying to run two instances of the lbm benchmark on two individual cores. However, I'm getting the following fatal error: "Out of memory, please increase size of physical memory." Has anyone faced this problem before? Thanks in advance.
Re: [gem5-users] simObject clocks and global simulation clock
Ticks are the basic unit of time in gem5, and gem5 uses them for synchronization. Every clock cycle of an object is made up of some number n of ticks. For a unit with frequency = 2 GHz, one cycle = 500 ticks (at the default resolution of 1 tick = 1 ps).

On Tue, Dec 16, 2014 at 7:04 PM, Anny via gem5-users gem5-users@gem5.org wrote:
> Hi all,
>
> I have a question about clocks in gem5. It seems that there is a global simulation clock and that every SimObject has a clock domain. The eventq is sorted in time. When two objects with two different clocks schedule two events on the eventq, how is the order determined, given that the two objects have different clocks? Are all objects synchronous? It seems that everything in the system is based on one clock (the global simulation clock)? It is binding.
>
> Best,
> Anny.
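The 500-tick figure, and the reason events from differently clocked objects can share one queue, can be worked through in a few lines. This is a sketch under one stated assumption, namely a resolution of 1 tick = 1 ps (10^12 ticks per second); the helper names are illustrative and are not gem5 API.

```python
TICKS_PER_SECOND = 10**12  # assumption: 1 tick = 1 ps


def ticks_per_cycle(freq_hz: int) -> int:
    """Clock period of an object, expressed in ticks."""
    return TICKS_PER_SECOND // freq_hz


def cycle_to_tick(cycle: int, freq_hz: int) -> int:
    """Absolute tick at which the given cycle of an object's clock starts."""
    return cycle * ticks_per_cycle(freq_hz)


# Two objects with different clocks schedule events in their own cycles;
# once converted to ticks, both land on the same time axis and can be sorted.
events = [
    (cycle_to_tick(2, 1_000_000_000), "object at 1 GHz, cycle 2"),
    (cycle_to_tick(3, 2_000_000_000), "object at 2 GHz, cycle 3"),
]
events.sort()
```

Here the 2 GHz object's event falls at tick 1500 and the 1 GHz object's at tick 2000, so the former is ordered first even though it was scheduled in a later cycle of its own clock.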
[gem5-users] Waking up instructions dependent on load
Hi

I am working with ARM SE mode. Suppose we have:

    ldr r4, [r2, #8]
    add r1, r4, r1

It looks to me as though the dependent add instruction is woken up only after the load instruction has written back. Also, the write back of the load instruction happens after commit. I would like to know if this is correct.

Thanks
Re: [gem5-users] Re-Executing LAS Conflicts
Hi Andreas and Arthur

It would be really helpful if you can provide some hints.

Thanks!

On Mon, Dec 1, 2014 at 10:56 PM, Vanchinathan Venkataramani dcsv...@gmail.com wrote:
> Hi Arthur
>
> Thanks a lot for your reply. Your interpretation of LAS is what I require. I want to replay execution starting from the load. It will be really helpful if you can give me hints on how to replay execution from this load instruction.
>
> Thanks
>
> On Mon, Dec 1, 2014 at 10:01 PM, Arthur Perais arthur.per...@inria.fr wrote:
>> Okay, the next comments assume that you are talking about a load that executed before an older store writing to the same address executed, and therefore got the wrong value. If what you call LAS refers to something else, disregard that.
>>
>> From what I gathered, the only replay mechanism currently implemented in the o3 CPU is there to deal with partial matches in store-to-load forwarding, for instance when a load needs data that is partly written by a store and partly in the dcache. In that case, the instruction is replayed when the store writes to the dcache (the mechanism is actually coarser than that, but you get the idea).
>>
>> If you want selective replay for memory order violations (which is okay but quite complex in my opinion), you need to implement it yourself. This entails:
>> - Getting all the instructions you need to replay (through register dependencies and memory dependencies).
>> - Restoring their state (clear the Issued flag, clear the Executed flag, and so on).
>> - Restoring dependencies, which is non-trivial since wakeDependents in inst_queue_impl.hh clears dependencies in dep_graph.hh when waking up insts. This means that you need to retain dependencies even after instructions have issued. You also need to deal with memory dependencies.
>> - Deciding how you replay. From the IQ? If so, then you can't free the IQ entry upon issue. If not, then you need a particular buffer to replay instructions from.
>>
>> If you want non-selective replay, this should be easier, although dependencies still have to be restored and you have to deal with the question of where the instructions are replayed from.
>>
>> Hope this helps, and if anyone sees a gross mistake in what I said, do not hesitate.
>>
>> On 01/12/2014 14:47, Vanchinathan Venkataramani via gem5-users wrote:
>>> Hi Andreas
>>>
>>> In ARM O3CPU, when there is a load after store violation, the younger instructions are being squashed and re-fetched again. Is it possible to re-execute these instructions instead of squashing all the younger instructions?
>>>
>>> Thanks
>>
>> --
>> Arthur Perais
>> INRIA Bretagne Atlantique
>> Bâtiment 12E, Bureau E303, Campus de Beaulieu
>> 35042 Rennes, France
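The first step Arthur lists, collecting every instruction that must be replayed, amounts to a transitive walk over the dependency graph starting at the violating load. The following is a toy sketch of that walk only. It assumes the register and memory dependencies have been retained as a plain producer-to-consumers mapping keyed by sequence number, which is precisely what gem5's dep_graph.hh does not keep after wakeup, and the function name is hypothetical.

```python
from collections import deque
from typing import Dict, List


def instructions_to_replay(load_seq: int,
                           dependents: Dict[int, List[int]]) -> List[int]:
    """Return the load and everything transitively dependent on it.

    `dependents` maps an instruction's sequence number to the sequence
    numbers of the instructions that consume its result.
    """
    seen = {load_seq}
    queue = deque([load_seq])
    while queue:
        inst = queue.popleft()
        for consumer in dependents.get(inst, []):
            if consumer not in seen:
                seen.add(consumer)
                queue.append(consumer)
    # Sequence numbers increase in program order, so sorting gives replay order.
    return sorted(seen)
```

For a graph such as {10: [11, 13], 11: [12], 14: [15]} and a violating load with sequence number 10, the replay set is [10, 11, 12, 13]; instructions 14 and 15 are independent of the load and keep their results. The remaining steps in the list (restoring flags and choosing a replay buffer) are not modelled here.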
Re: [gem5-users] Re-Executing LAS Conflicts
Hi Andreas, In ARM O3CPU, when there is a load-after-store violation, the younger instructions are squashed and re-fetched. Is it possible to re-execute these instructions instead of squashing all the younger instructions? Thanks
[gem5-users] Dumping stats every N cycles
I want to dump the counter statistics into a file every N cycles. I saw some old posts on using periodicStatDump. I tried calling periodicStatDump(N) in se.py. This makes the simulation halt, and the stats are never updated. I would like to know if I'm doing something wrong. Thanks
[gem5-users] Computing CPI Stack
Hi all, I would like to know how I can build up the CPI stack from gem5 statistics. Thanks in advance
[gem5-users] Heterogeneous system in FS mode
Hi all

I'm trying to create a heterogeneous system with one 2-way, two 4-way, and one 8-way core in arm-detailed FS mode. Following is the change I made in fs.py:

    cpus = []
    for i in xrange(4):
        if i == 0:
            cpus.append(2-way)
        elif i == 1:
            cpus.append(4-way)
        elif i == 2:
            cpus.append(4-way)
        elif i == 3:
            cpus.append(4-way)
    test_sys.cpu = cpus

However, when I try to create a checkpoint, I get the following error:

    break event panic triggered

Is there something I'm missing?

Thanks
V Vanchinathan
[gem5-users] Writing to float Register File
Hi all, I'm trying to understand the floating point register file. There are two functions responsible for writing floating point registers: 1. setFloatReg 2. setFloatRegBits. I'm not able to understand under what scenarios each of these functions is called. Thanks, V Vanchinathan
[gem5-users] Running Assembly code on ARM SE mode
Hi all, Is it possible to write my own assembly code containing around 10-15 instructions and run it on gem5 without converting it into a binary with header and footer code? Thanks in advance. V Vanchinathan
Re: [gem5-users] svc Instruction
At the end of the svc instruction, R0 is being updated. In which cases is R0 changed?

Thanks
V Vanchinathan

On Wed, Aug 27, 2014 at 5:23 PM, Vanchinathan Venkataramani dcsv...@gmail.com wrote:
> Hi all
>
> I would like to know how svc (trap) instruction is implemented in gem5. In particular I want to know on what condition the register values are changed when svc syscall is present.
>
> Thanks a lot!
[gem5-users] svc Instruction
Hi all, I would like to know how the svc (trap) instruction is implemented in gem5. In particular, I want to know under what conditions the register values are changed when an svc syscall is executed. Thanks a lot!
[gem5-users] Having multi-ported L1 Cache
I would like to know if it is possible to connect the dcache_port of each CPU to a separate port on the L1 cache in the classic memory model. Currently the L1 cache has a single cpu_side port. I wanted to know if the functionality will still be correct if I change this to a vector port and connect each CPU's dcache_port to it. Thanks, V Vanchinathan
Re: [gem5-users] Having multi-ported L1 Cache
Thanks a lot for your prompt reply.

I currently have an L1 shared between two cores. The dcache_port on each CPU is connected to a bus, which is in turn connected to the L1 D-cache. My understanding was that there can't be memory accesses from both CPUs in the same cycle (even if they belong to two different cache lines). However, if the L1 D-cache has one dedicated port for each CPU, then the accesses can be parallelized.

Thanks a lot!
V Vanchinathan

On Mon, Jul 21, 2014 at 8:38 PM, Andreas Hansson andreas.hans...@arm.com wrote:
> Hi,
>
> It is not obvious what you are trying to achieve here. Could you share some more details on what you are after? Are you looking for more bandwidth to the L1 (it is already infinite)? Are you looking to have more outstanding transactions (it is already a parameter)? Are you looking to share the L1 between two cores (if so, use a bus/crossbar)?
>
> Andreas
>
> From: Vanchinathan Venkataramani via gem5-users gem5-users@gem5.org
> Reply-To: Vanchinathan Venkataramani dcsv...@gmail.com, gem5 users mailing list gem5-users@gem5.org
> Date: Monday, 21 July 2014 12:16
> To: gem5 users mailing list gem5-users@gem5.org
> Subject: [gem5-users] Having multi-ported L1 Cache
>
> I would like to know if it is possible to connect the dcache_port of a CPU to separate ports on L1 Cache in classic memory model. Currently L1 cache has a single cpu_side port. I wanted to know if the functionality will be correct if I change this to a vector instead and connect each of the CPU dcache_port to this.
>
> Thanks
> V Vanchinathan
[gem5-users] Wrong decoding of instructions
I'm trying to execute a binary on the ARM gem5 O3CPU model. I made some modifications to the code. Now, some instructions wrongly get decoded as four micro-ops. I'm not sure how I should go about debugging this problem. Any help is really appreciated. Thanks a lot!
Re: [gem5-users] Wrong decoding of instructions
I was able to figure out that the wrong decoding of instructions was happening by using the Exec trace. My thought was that changing the pipeline code of gem5 would not affect the fetching of instructions. Hence, I was not sure how to go about finding the bug. Thanks V Vanchinathan On Mon, Jul 21, 2014 at 12:44 AM, Ali Saidi sa...@umich.edu wrote: Why don’t you compare the un-modified trace (with --debug-flags=Exec) to the modified one? Ali On Jul 20, 2014, at 11:06 AM, Vanchinathan Venkataramani via gem5-users gem5-users@gem5.org wrote: I'm trying to execute a binary on ARM gem5 O3CPU model. I made some modification to the code. Now, some instructions wrongly get encoded as four micro-ops. I'm not sure how I should go about for debugging this problem. Any help is really appreciated. Thanks a lot!
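The trace comparison suggested above can be sketched as a small script. This is a minimal sketch, assuming each trace line starts with a tick count followed by a colon; since a modified pipeline will shift the ticks, the tick field is dropped before comparing. The function names are hypothetical helpers, not part of gem5.

```python
from itertools import zip_longest


def strip_tick(line):
    """Drop a leading '<tick>:' field so timing shifts are not reported as differences."""
    head, sep, rest = line.partition(":")
    if sep and head.strip().isdigit():
        return rest.strip()
    return line.strip()


def first_divergence(baseline_lines, modified_lines):
    """Return (index, baseline_record, modified_record) for the first mismatch, or None."""
    pairs = zip_longest(baseline_lines, modified_lines)
    for index, (base, mod) in enumerate(pairs):
        base_body = strip_tick(base) if base is not None else None
        mod_body = strip_tick(mod) if mod is not None else None
        if base_body != mod_body:
            return index, base_body, mod_body
    return None
```

Both arguments can be any iterables of lines, for example two open file objects holding the baseline and modified traces. The first reported record is usually the place to start looking, because everything after it may differ only as a consequence.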
[gem5-users] Shared LSQ
Hi all I am trying to share the LSQ between all cores in O3CPU. After reading the Python and C++ files, I understand that the LSQ is part of the dcache port and the IEW stage of a CPU. Also, an LSQ needs to be associated with a CPU. Using the same LSQ pointer doesn't work. Is there any other way to solve this problem? Any help is highly appreciated. Thanks a lot V Vanchinathan
[gem5-users] Task migration of an application
I am trying to migrate a task from one CPU to another in ARM SE mode. I understand that a lot of attributes need to be copied from one CPU to another. Is there a checklist of things that I need to copy? Thanks V Vanchinathan
Re: [gem5-users] Reading wrong data from Cache
Since 10 is the last value written to address A, it has to read 10 from address A. However, it reads value 20 from address A. On Mon, May 5, 2014 at 11:56 AM, GE ZHIGUO via gem5-users gem5-users@gem5.org wrote: What did you mean by no data written to address A? *From:* gem5-users [mailto:gem5-users-boun...@gem5.org] *On Behalf Of* Vanchinathan Venkataramani via gem5-users *Sent:* Sunday, May 04, 2014 12:06 AM *To:* gem5 users mailing list *Subject:* [gem5-users] Reading wrong data from Cache Hi all I'm looking into the Exec flag trace on arm_detailed in SE mode. Let A be a virtual address. *Instruction 1* *writes* value *10* to *address A*. *Instruction 50* *reads* value *20* from *address A*. However, there is *no data written* to address A *between* instructions *1 and 50*. Why is this happening? Thanks V Vanchinathan
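The expectation stated above (a read must return the last value written to the same address) can be checked mechanically over a list of accesses. This is a minimal sketch under loud assumptions: the (op, addr, value) tuples are a hypothetical pre-parsed form, not the Exec trace format itself, and the check only holds for a single thread with same-sized, aligned accesses and no other writers to that memory.

```python
def find_stale_reads(trace):
    """Report reads that disagree with the last recorded write to the same address.

    trace: iterable of (op, addr, value) tuples, where op is 'W' or 'R'.
    Returns a list of (index, addr, expected, observed) tuples.
    """
    last_written = {}
    mismatches = []
    for index, (op, addr, value) in enumerate(trace):
        if op == "W":
            last_written[addr] = value
        elif op == "R" and addr in last_written:
            expected = last_written[addr]
            if expected != value:
                mismatches.append((index, addr, expected, value))
    return mismatches
```

If a mismatch is reported, it is worth checking for writes of a different width that overlap address A, or stores from another thread, before suspecting the cache model.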
[gem5-users] Sharing LoadStoreQueue
Hi all I am trying to share the load/store queue across CPUs in gem5. Since loadStoreQueue is an object in the IEW stage, I'm not sure how it can be shared across CPUs. I would like to know if there is a way to do this. Thanks a lot in advance. V Vanchinathan
Re: [gem5-users] Macro Ops splitting and Register 33
Thanks Martin for your reply. Assume that we have an instruction that tries to modify float register f0. In the decode stage, f0 is rewritten as 0 + FP_Base_DepTag. If FP_Base_DepTag = 1344, say, f0 will be decoded as 1344. In the rename stage, this register index is first checked against FP_Base_DepTag to determine whether it is INT, FP or Miscellaneous. After that, the flattened index is found. Later, the physical index corresponding to this flattened index is looked up. This physical index is used for reading the register value, for example in the readFloatReg function in cpu.cc. I hope this understanding is correct. It is important to note that 1416 is the decoded index for the CPSR register and 33 is the decoded index for the zero register. Thanks Vanchinathan V On Tue, Jan 14, 2014 at 6:40 AM, Martin Brown mbr...@cs.fsu.edu wrote: Mitch, he may be trying to get the correct register index, but is getting values like 33 and 1416 instead. Am I right V? I sort of tried to answer this in your other post http://www.mail-archive.com/gem5-users@gem5.org/msg09046.html. I got around it by calling BaseSimpleCPU::getFlatRegIdex(regIdx). I think I did that because I saw it being done elsewhere, as I was trying to get the logical register index. Can anyone verify that this is the right way? On Fri, Jan 3, 2014 at 8:15 AM, Mitch Hayenga mitch.hayenga+g...@gmail.com wrote: In general control registers are not often renamed (regardless of ISA - ARM, x86, etc). If any instruction modifies the CPSR, it should be marked as serializing. This ensures that any writes to the register will be properly seen by any younger instructions. It works by not letting any younger instructions enter the processor back-end until the CPSR write completes. These control registers tend to be mostly read-only, so tolerating a pipeline stall on writes is not unreasonable. So, it likely is being used, it's just not renamed.
On Fri, Jan 3, 2014 at 1:41 AM, Vanchinathan Venkataramani dcsv...@gmail.com wrote: Thanks a lot for your reply. Similar to R33, R37-R39 and so on, there is a register with ID 1416 which appears as a source for many instructions. I found that this register is the CPSR register. I tried looking at the rename and IEW stages for a given instruction. It turns out that this register is being accessed in the rename stage but not in the IEW stage. Does this mean that the CPSR is not required for these instructions? Thanks V Vanchinathan On Thu, Jan 2, 2014 at 10:03 PM, Mitch Hayenga mitch.hayenga+g...@gmail.com wrote: R33 is a zero register. It is used whenever a zero is required. It is also often sourced unnecessarily if an instruction requires fewer source registers. In gem5 the basis for splitting is solely up to whoever wrote the ISA decoder. For ARM it's mostly what you would expect: 2 real sources (not counting flags, which can add more; R37-39 or so are the flags). In the past I tried to find a way to compile without predication too. Never found a way. Hope this helps. Sent from my phone. On Jan 2, 2014 7:38 AM, Vanchinathan Venkataramani dcsv...@gmail.com wrote: Hi all I tried to print the list of source registers in a given assembly instruction during the decode stage for ARM ISA. It turns out that many instructions have Register 33 as one of the source registers. I would like to know what this Register 33 signifies. Also I would like to know on what basis a macroop is split into microops since they introduce a lot of temporary registers. Is there a way to disable predication in ARM ISA while compiling?
Thanks V Vanchinathan -- Martin
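The index classification described at the top of this message (compare the decoded index against the base tags, then subtract the base to get the index within that register class) can be sketched as follows. This is an illustration only: fp_base=1344 is the "say" value from the message, and misc_base=1416 is an assumption picked so that the CPSR example lands in the miscellaneous class. The real constants depend on the gem5 version and ISA, and the function is hypothetical, not gem5 code.

```python
INT, FP, MISC = "int", "fp", "misc"


def classify(decoded_index, fp_base=1344, misc_base=1416):
    """Split a unified decoded register index into (class, index within class).

    fp_base and misc_base are assumed example values, not authoritative constants.
    """
    if decoded_index < fp_base:
        return INT, decoded_index
    if decoded_index < misc_base:
        return FP, decoded_index - fp_base
    return MISC, decoded_index - misc_base
```

Under these assumed bases, classify(1344) gives the FP class with index 0 (the f0 example), classify(33) stays in the integer class as index 33 (the zero register), and classify(1416) falls in the miscellaneous class.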
Re: [gem5-users] Accessing cpu from system
I solved this using a naive void-pointer approach. On Tue, Jan 7, 2014 at 5:59 PM, Vanchinathan Venkataramani dcsv...@gmail.com wrote: Hi I would like to know if there is a way to access cpu objects from the System class. The Python System declaration in se.py has cpu as a parameter. However, this value is not being used inside system.hh. Thanks a lot. V Vanchinathan
[gem5-users] Macro Ops splitting and Register 33
Hi all I tried to print the list of source registers in a given assembly instruction during the decode stage for ARM ISA. It turns out that many instructions have Register 33 as one of the source registers. I would like to know what this Register 33 signifies. Also I would like to know on what basis a macroop is split into microops since they introduce a lot of temporary registers. Is there a way to disable predication in ARM ISA while compiling? Thanks V Vanchinathan
Re: [gem5-users] Macro Ops splitting and Register 33
Thanks a lot for your reply. Similar to R33, R37-R39 and so on, there is a register with ID 1416 which appears as a source for many instructions. I found that this register is the CPSR register. I tried looking at the rename and IEW stages for a given instruction. It turns out that this register is being accessed in the rename stage but not in the IEW stage. Does this mean that the CPSR is not required for these instructions? Thanks V Vanchinathan On Thu, Jan 2, 2014 at 10:03 PM, Mitch Hayenga mitch.hayenga+g...@gmail.com wrote: R33 is a zero register. It is used whenever a zero is required. It is also often sourced unnecessarily if an instruction requires fewer source registers. In gem5 the basis for splitting is solely up to whoever wrote the ISA decoder. For ARM it's mostly what you would expect: 2 real sources (not counting flags, which can add more; R37-39 or so are the flags). In the past I tried to find a way to compile without predication too. Never found a way. Hope this helps. Sent from my phone. On Jan 2, 2014 7:38 AM, Vanchinathan Venkataramani dcsv...@gmail.com wrote: Hi all I tried to print the list of source registers in a given assembly instruction during the decode stage for ARM ISA. It turns out that many instructions have Register 33 as one of the source registers. I would like to know what this Register 33 signifies. Also I would like to know on what basis a macroop is split into microops since they introduce a lot of temporary registers. Is there a way to disable predication in ARM ISA while compiling? Thanks V Vanchinathan
[gem5-users] Register File Reading
Hi all I am trying to understand how register file reading works in gem5 for arm_detailed. I found that in the IEW stage, inside the executeInsts function, inst->execute(); is being called. I looked up this function and found a comment saying that it is auto-generated. In cpu.cc, the function readIntRegs accesses the register file with logical indexing. I would like to know where the physical to logical register indexing is done. Thanks V Vanchinathan
[gem5-users] Cache Trace For Multi Threaded Benchmarks
Hi all I am trying to get the cache trace for a multi-threaded benchmark in gem5 for the ARM ISA. I added a new debug flag to the access method in src/mem/cache/cache_impl.hh to get the cache accesses. In addition, I am trying to get the ID of the CPU that sent each access. From the cache I can access only the system and not the CPU. I would like to know if there is a way to get this done. Thanks a lot in advance. V Vanchinathan
[gem5-users] Fwd: Disabling branch predictor
I would like to know if there is a way to disable the branch predictor in the O3 CPU model for ARM. Thanks V Vanchi