Re: [gem5-users] Micro-op Data Dependency
I don't know that off the top of my head; the ISAs I'm familiar with are either not microcoded, or use a micro-op assembler to generate all the micro-ops (i.e., x86). Have you looked at how ARM micro-ops are constructed? That's the one ISA that I believe is mostly not microcoded but still has some microcode in it.

Though come to think of it, it may be as easy as just using a constant where the other operands specify the machine code bitfield, if there's syntax that allows that.

Steve

On Tue, Aug 2, 2016 at 1:54 PM Alec Roelke wrote:
> Okay, thanks. How do I tell the ISA parser that the 'Rt' operand I've
> created refers to the extra architectural register? Or is there some
> function I can call inside the instruction's code that writes directly to
> an architectural register? All I can see from the code GEM5 generates is
> "setIntRegOperand," which takes indices into _destRegIdx rather than
> register indices.
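The reason this thread recommends a temporary register can be shown with a toy model: a CPU model sees only the register operands each micro-op declares. The sketch below is plain Python, not gem5 code. The names MicroOp and raw_dependencies are invented for illustration, and r32 stands in for the suggested microcode-only temporary register. It checks only read-after-write links between a producer and later consumers, which is far simpler than what the O3 or Minor models actually do.

```python
from collections import namedtuple

# A micro-op as the CPU model sees it: only its register operands are visible.
MicroOp = namedtuple("MicroOp", ["name", "srcs", "dests"])


def raw_dependencies(micro_ops):
    """Return (producer, consumer) name pairs linked by a read-after-write
    register dependency, scanning the ops in program order."""
    deps = []
    for i, producer in enumerate(micro_ops):
        for consumer in micro_ops[i + 1:]:
            if set(producer.dests) & set(consumer.srcs):
                deps.append((producer.name, consumer.name))
    return deps


# Value passed through a hidden C++ variable: no register links the two ops.
hidden = [
    MicroOp("load", srcs=["rs1"], dests=[]),
    MicroOp("modify_write", srcs=["rs1", "rs2"], dests=["rd"]),
]

# Value passed through a temporary register (hypothetically named r32 here).
via_temp = [
    MicroOp("load", srcs=["rs1"], dests=["r32"]),
    MicroOp("modify_write", srcs=["rs1", "rs2", "r32"], dests=["rd"]),
]

print(raw_dependencies(hidden))    # no pairs: nothing forces an order
print(raw_dependencies(via_temp))  # the load -> modify_write link is visible
```

With the hidden variable, nothing in the operand lists tells the model that the modify-write micro-op needs the load's result, which matches the reordering reported for the minor model.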
Re: [gem5-users] Micro-op Data Dependency
Okay, thanks. How do I tell the ISA parser that the 'Rt' operand I've created refers to the extra architectural register? Or is there some function I can call inside the instruction's code that writes directly to an architectural register? All I can see from the code GEM5 generates is "setIntRegOperand," which takes indices into _destRegIdx rather than register indices.

On Mon, Aug 1, 2016 at 10:58 AM, Steve Reinhardt wrote:
> You don't need to worry about the size of the bitfield in the instruction
> encoding, because the temporary register(s) will never be directly
> addressed by any machine instruction. You should define a new
> architectural register using an index that doesn't appear in any
> instruction (e.g., if the ISA includes r0 to r31, then the temp reg can be
> r32). This register will get renamed in the O3 model.
>
> Steve
>
> On Sun, Jul 31, 2016 at 7:21 AM Alec Roelke wrote:
>> That makes sense. Would it be enough for me to just create a new IntReg
>> operand, like this:
>>
>>     'Rt': ('IntReg', 'ud', None, 'IsInteger', 4)
>>
>> and then increase the number of integer registers? The other integer
>> operands have a bit field from the instruction bits, but since the ISA
>> doesn't specify that these RMW instructions should be microcoded, there's
>> no way to decode a temporary register from the instruction bits. Will GEM5
>> understand that and pick any integer register that's available?
>>
>> The memory address is taken from Rs1 before the load micro-op, and then
>> stored in a C++ variable for the remainder of the instruction. That was
>> done to ensure that other intervening instructions that might get executed
>> in the O3 model don't change Rs1 between the load and modify-write
>> micro-ops, but if I can get the temp register to work then that might fix
>> itself.
>>
>> I was only setting _srcRegIdx and _destRegIdx for disassembly reasons;
>> since the macro-op and first micro-op don't make use of Rs2, the
>> instruction wasn't setting _srcRegIdx[1] and the disassembly would show
>> something like 4294967295. Then it presented a potential solution to the
>> minor CPU model problem I described before.
>>
>> No, most of the ISA is not microcoded. In fact, as I said, these RMW
>> instructions are not specified to be microcoded by the ISA, but since they
>> each have two memory transactions they didn't appear to work unless I split
>> them into two micro-ops.
>>
>> On Sat, Jul 30, 2016 at 2:14 PM, Steve Reinhardt wrote:
>>> You shouldn't be passing values between micro-ops using C++ variables,
>>> you should pass the data in a register. (If necessary, create
>>> microcode-only temporary registers for this purpose, like x86 does.) This
>>> is microarchitectural state so you can't hide it from the CPU model. The
>>> main problem here is that, since this "hidden" data dependency isn't
>>> visible to the CPU model, it doesn't know that the micro-ops must be
>>> executed in order. If you pass that data in a register, the pipeline model
>>> will enforce the dependency.
>>>
>>> Also, where do you set the address for the memory accesses? Again, both
>>> micro-ops should read that out of a register, it should not be passed
>>> implicitly via hidden variables.
>>>
>>> You shouldn't have to explicitly set the internal fields like _srcRegIdx
>>> and _destRegIdx, the ISA parser should do that for you.
>>>
>>> Unfortunately the ISA description system wasn't originally designed to
>>> support microcode, and that support was kind of shoehorned in after the
>>> fact, so it is a little messy. Is your whole ISA microcoded, or just a few
>>> specific instructions?
>>>
>>> Steve
>>>
>>> On Fri, Jul 29, 2016 at 7:37 PM Alec Roelke wrote:
>>>> Sure, I can show some code snippets. First, here is the code for the
>>>> read micro-op for an atomic read-add-write:
>>>>
>>>>     temp = Mem_sd;
>>>>
>>>> And the modify-write micro-op:
>>>>
>>>>     Rd_sd = temp;
>>>>     Mem_sd = Rs2_sd + temp;
>>>>
>>>> The memory address comes from Rs1. The variable "temp" is a temporary
>>>> location shared between the read and modify-write micro-ops (the address
>>>> from Rs1 is shared similarly to ensure it's the same when the instructions
>>>> are issued).
>>>>
>>>> In the constructor for the macro-op, I've included some code that
>>>> explicitly sets the src and dest register indices so that they are
>>>> displayed properly for execution traces:
>>>>
>>>>     _numSrcRegs = 2;
>>>>     _srcRegIdx[0] = RS1;
>>>>     _srcRegIdx[1] = RS2;
>>>>     _numDestRegs = 1;
>>>>     _destRegIdx[0] = RD;
>>>>
>>>> So far, this works for the O3 model. But, in the minor model, it tries to
>>>> execute the modify-write micro-op before the read micro-op is executed.
>>>> The address is never loaded from Rs1, and so a segmentation fault often
>>>> occurs. To try to fix it, I added this code to the constructors of each
>>>> of the two micro-ops:
Re: [gem5-users] Physmem in SE Mode (Jason Lowe-Power)
Hello Jason,

Thanks for your reply. I have a few questions:

- I looked at the Status Matrix (http://www.m5sim.org/Status_Matrix), and it says the memory system for SPARC does not work with the InOrder (or MinorCPU) CPU. In that case, how can I run anything using the MinorCPU for SPARC? I tried compiling SPARC with the MinorCPU option, but it didn't work when I tried to run the hello world program; it basically reported an unavailable option.
- The "To Do List" for the InOrder CPU says SPARC is partially implemented. What does that mean?
- For threading in SPARC, is it the same as running multiple workloads on multiple CPUs in SE mode? I have not yet seen any SPARC implementation in GEM5 using the InOrder detailed CPU. Apart from the timing of the pipeline stages, can I be functionally correct by running SPARC in the "timing" model of the "SimpleCPU" and calculating the power using McPAT?
- To assign multiple workloads in SE mode, I tried simply increasing "np" to 2 and running two "hello_world" binaries. In the stats file, I see data only for "cpu0"; "cpu1" seems to be idle (all 0).
- Where should I look for the "Detailed MinorCPU" implementation for SPARC?

/Monir
Re: [gem5-users] Making a C application work on multiple cores with gem5
Hi Jason,

For the cores: not simultaneously. I read about what gem5 supports, so I will try, for example, 4 x86 cores and then 4 ARM cores. By the way, do you have any information about ARM SIMD instructions, please?

Thank you

On Tue, Aug 2, 2016 at 3:28 PM, Jason Lowe-Power wrote:
> Hello,
>
> If you're using full system mode (FS mode), you can use pthreads or any
> other threading library just like on a real machine. If you're using
> syscall emulation (SE) mode, then you can use the m5threads library which
> is a pthreads-like library (http://repo.gem5.org/m5threads/).
>
> If I've misunderstood your question and you want to try to use x86 and ARM
> cores simultaneously... that currently isn't supported by gem5.
>
> Jason

--
Anouar NECHI
IT Engineer: Industrial systems
Higher Institute of Computer Science
Tunis - El Manar University
Phone: (+216) 50 311 536
E-mail: anoirne...@gmail.com
Re: [gem5-users] Physmem in SE Mode
Hi Monir,

The AbstractMemory class (along with the System class) implements the physical memory of the system. When configuring gem5, if you instantiate a memory object (e.g., a DRAMCtrl like DDR3_1600_x64), this object will register the physical memory with the System object. With the DRAMCtrl, you can configure both the size and the location in the address space of the physical memory.

As for configuring a system like a SPARC T1, there isn't anything out of the box that will "just work". You'll have to dig into the CPU options (probably the MinorCPU, since the T1 was in-order) to see if you can enable threading and configure it like the T1.

Jason

On Mon, Aug 1, 2016 at 10:31 AM Zaman, Monir wrote:
> Hello all,
>
> I was running the example/se.py script for my test cases, and I don’t see
> the “physmem” stats which mimics the DRAM. I do see a 512MB value for
> memory, but how do I setup the Physical Memory (Main Memory) in the setup?
>
> Also, how do I set up the Hardware threading to mimic the SPARC T1
> processor?
>
> Thanks
>
> Monir
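The memory setup described above might look roughly like the fragment below. This is a sketch assuming a 2016-era gem5 configuration script, not a complete system: it omits the CPU, clock domain, workload, and the call that instantiates the simulation. It runs only under the gem5 binary, because the m5 package is embedded there. The 1GB size is an arbitrary illustrative value, and class and port names differ in other gem5 releases.

```python
# gem5 configuration fragment: run it with the gem5 binary, not plain Python.
from m5.objects import *

system = System()
system.mem_mode = 'timing'

# Size and placement of physical memory: here 1GB starting at address 0
# (illustrative value, not taken from the thread).
system.mem_ranges = [AddrRange('1GB')]

system.membus = SystemXBar()

# The memory controller is given the address range it backs and is
# attached to the memory bus.
system.mem_ctrl = DDR3_1600_x64()
system.mem_ctrl.range = system.mem_ranges[0]
system.mem_ctrl.port = system.membus.master
```

The statistics for this memory appear under whatever attribute name the controller is assigned to (mem_ctrl in this sketch), so a name such as "physmem" may not show up in the stats file.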
Re: [gem5-users] Making a C application work on multiple cores with gem5
Hello,

If you're using full system mode (FS mode), you can use pthreads or any other threading library just like on a real machine. If you're using syscall emulation (SE) mode, then you can use the m5threads library which is a pthreads-like library (http://repo.gem5.org/m5threads/).

If I've misunderstood your question and you want to try to use x86 and ARM cores simultaneously... that currently isn't supported by gem5.

Jason

On Tue, Aug 2, 2016 at 4:18 AM anoir nechi wrote:
> hello
>
> I am new with gem5 simulator. I have a C application that i want to make
> it run faster. So the first thing I've done is to optimize it using several
> techniques like loop unrolling and SIMD. And the next step, i intend to
> make it work on *multiple cores* (*X86* and *ARM*) for that i must use
> the gem5 simulator.
>
> The application is for Radix4 computing. For now I've succeeded to make it
> work on one core systems for *X86* and *ARM* but, now i want to make it
> work on 4, 16, ... cores X86 or ARM.
>
> could someone give me some hints or show me the right way to do this?
> Thank you
[gem5-users] Virtual to Physical Address in ARMv8 FS Classic Memory
Hi all,

I am currently running an application on a 64-core ARMv8 FS system with Classic Memory, with individual L1 D and I caches and a unified L2 cache.

Looking at the cache memory trace, two virtual addresses, one from kernel space (e.g. 0xffc071a63400) and one from application space (e.g. 0x915400), are mapped to the same physical address (e.g. 0xf1a63400).

The *kernel memory access* occurs first and ends as a *cache miss*. However, the first access to the application memory address ends up as a *cache hit*. I double-checked the cache trace and statistics to confirm this.

One explanation is that these belong to two different threads and hence can have the same physical address due to context switching. However, if that were the case, the access to the application address should end up as a miss (which it does not).

Any explanation is greatly appreciated. Thanks a lot in advance.
[gem5-users] Making a C application work on multiple cores with gem5
Hello,

I am new to the gem5 simulator. I have a C application that I want to make run faster. The first thing I did was optimize it using several techniques such as loop unrolling and SIMD. As the next step, I intend to make it work on *multiple cores* (*X86* and *ARM*), and for that I must use the gem5 simulator.

The application is for Radix4 computing. So far I have succeeded in making it work on single-core systems for *X86* and *ARM*, but now I want to make it work on 4, 16, ... X86 or ARM cores.

Could someone give me some hints or show me the right way to do this?

Thank you