[m5-users] Ruby questions
Hi everyone,

I have a few updated questions to ask. This is particularly aimed at Brad, but if anyone else knows, I would love some answers.

Why are there n+1 switches rather than just one in the crossbar definition? That is to ask, why couldn't this snippet:

    ext_links = [ExtLink(ext_node=n, int_node=i)
                 for (i, n) in enumerate(nodes)]
    xbar = len(nodes) # node ID for crossbar switch
    int_links = [IntLink(node_a=i, node_b=xbar) for i in range(len(nodes))]
    return Crossbar(ext_links=ext_links, int_links=int_links,
                    num_int_nodes=len(nodes)+1)

instead be implemented as so?

    ext_links = [ExtLink(ext_node=n, int_node=0)
                 for (i, n) in enumerate(nodes)]
    int_links = []
    return Crossbar(ext_links=ext_links, int_links=int_links,
                    num_int_nodes=1)

What is the status of garnet? I haven't seen much discussion of it on either mailing list.

Is a Bus interconnect possible using either of the ruby network models? From what I can tell it's not. The gems-users mailing list seemed to agree.

-Joseph

___
m5-users mailing list
m5-users@m5sim.org
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
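To make the structural difference between the two crossbar definitions concrete, here is a minimal sketch that builds both and counts the links a message crosses between two external nodes. The ExtLink, IntLink and Crossbar classes below are plain namedtuple stand-ins (an assumption for illustration, not the real Ruby topology SimObjects), and the helper names make_star_crossbar, make_single_switch and links_traversed are hypothetical. It only shows the shape of each graph; it does not say why Ruby uses the n+1 layout.

```python
from collections import deque, namedtuple

# Stand-ins for the Ruby topology classes (assumption: illustration only,
# these are NOT the real M5/Ruby SimObjects).
ExtLink = namedtuple("ExtLink", ["ext_node", "int_node"])
IntLink = namedtuple("IntLink", ["node_a", "node_b"])
Crossbar = namedtuple("Crossbar", ["ext_links", "int_links", "num_int_nodes"])


def make_star_crossbar(nodes):
    """n edge switches, each tied to one central switch (n+1 switches)."""
    ext_links = [ExtLink(ext_node=n, int_node=i) for (i, n) in enumerate(nodes)]
    xbar = len(nodes)  # node ID for crossbar switch
    int_links = [IntLink(node_a=i, node_b=xbar) for i in range(len(nodes))]
    return Crossbar(ext_links, int_links, len(nodes) + 1)


def make_single_switch(nodes):
    """Every external node attached directly to switch 0 (one switch)."""
    ext_links = [ExtLink(ext_node=n, int_node=0) for n in nodes]
    return Crossbar(ext_links, [], 1)


def links_traversed(topo, src, dst):
    """Breadth-first search: number of links between two external nodes."""
    adj = {}

    def connect(u, v):
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    for link in topo.ext_links:
        connect(("ext", link.ext_node), ("int", link.int_node))
    for link in topo.int_links:
        connect(("int", link.node_a), ("int", link.node_b))

    start, goal = ("ext", src), ("ext", dst)
    dist = {start: 0}
    queue = deque([start])
    while queue:
        u = queue.popleft()
        if u == goal:
            return dist[u]
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return None  # dst is not reachable from src
```

With three external nodes, the first definition routes a message over two external links and two internal links, while the second uses only the two external links, so any latency or bandwidth attached to the internal links exists only in the first form.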
[m5-users] Potential typo in configs/example/fs.py and assertion failure when doing StandardSurge
Hi all,

I was trying to boot a dual-system configuration with fs.py to run StandardSurge, and it fails. The first failure was caused by this line in fs.py: drive_sys.cpu.connectMemPorts(drive_sys.membus). There is no function connectMemPorts in BaseCPU.py. I think I resolved this problem by changing the call to connectAllPorts. The simulation then gets a little further before the drive system fails with the output shown below under M5 Output. I also provide a trace of Exec,Cache,Bus for the drive_sys below, which indicates that the failure was caused by an attempted access by the drive_sys to address 0xf00188. The only other reference to this problem that I could find is http://www.mail-archive.com/m5sim-users@lists.sourceforge.net/msg01082.html where a user was attempting to implement a new device in the system. I also provide the kernel output for the drive_sys under the heading Kernel Output.

My guess is that this code hasn't been tested for a while, since the Python script fails, or that it is tested with a script other than fs.py. If anyone is able to successfully run StandardSurge and SpecSurge using fs.py, has an idea of what is going wrong, or has some debugging suggestions, I am all ears.

Thanks,
-Rick

M5 Output

command line: /home/rstrong/build/m5-idle/build/ALPHA_FS/m5.opt configs/example/fs.py --benchmark=SurgeStandard maxtick= 9223372036854775807
Global frequency set at 1 ticks per second
info: kernel located at: /home/rstrong/dist/m5/system/binaries/vmlinux-m5
Listening for testsys connection on port 3456
0: testsys.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009
info: kernel located at: /home/rstrong/dist/m5/system/binaries/vmlinux-m5
Listening for drivesys connection on port 3457
0: drivesys.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009
0: testsys.remote_gdb.listener: listening for remote gdb #0 on port 7000
0: drivesys.remote_gdb.listener: listening for remote gdb #1 on port 7001
simulating till maxtick:9223372036854775807
REAL SIMULATION
info: Entering event queue @ 0. Starting simulation...
m5.opt: /home/rstrong/build/m5-idle/build/ALPHA_FS/cpu/simple/atomic.cc:347: Fault AtomicSimpleCPU::readBytes(Addr, uint8_t*, unsigned int, unsigned int): Assertion `!pkt.isError()' failed.
Program aborted at cycle 96978500
Aborted

--trace-flags=EXEC,Cache,Bus trace:

0: drivesys.tsunami.io.rtc: Real-time clock set to Thu Jan 1 00:00:00 2009
96968500: drivesys.cpu: Tick
96968500: drivesys.cpu T0 : @sys_machine_check+109: hw_mtpr r14,IPR(0x150) : IprAccess : D=0x00980679
96969000: drivesys.cpu: Tick
96969000: drivesys.cpu T0 : @sys_machine_check+113: ldah r14,-16(r31): IntAlu : D=0xfff0
96969500: drivesys.cpu: Tick
96969500: drivesys.cpu T0 : @sys_machine_check+117: hw_mtpr r1,IPR(0x141) : IprAccess : D=0x1000
9697: drivesys.cpu: Tick
9697: drivesys.cpu T0 : @sys_machine_check+121: zap r38,224,r38 : IntAlu : D=0x00f0
96970500: drivesys.cpu: Tick
96970500: drivesys.cpu T0 : @sys_machine_check+125: hw_mtpr r12,IPR(0x14a) : IprAccess : D=0x42d1
96971000: drivesys.cpu: Tick
96971000: drivesys.cpu T0 : @sys_machine_check+129: hw_mtpr r31,IPR(0x119) : IprAccess : D=0x
96971500: drivesys.cpu: Tick
96971500: drivesys.cpu T0 : @sys_machine_check+133: blbs r37,0x9321 : IntAlu :
96972000: drivesys.cpu: Tick
96972000: drivesys.cpu T0 : @sys_machine_check+137: hw_mtpr r6,IPR(0x146) : IprAccess : D=0x0072
96972500: drivesys.cpu: Tick
96972500: drivesys.cpu T0 : @sys_machine_check+141: hw_mtpr r5,IPR(0x145) : IprAccess : D=0x0071
96973000: drivesys.cpu: Tick
96973000: drivesys.cpu T0 : @sys_machine_check+145: hw_mfpr IPR(0x100),r4 : IprAccess : D=0x
96973500: drivesys.cpu: Tick
96973500: drivesys.cpu T0 : @sys_machine_check+149: srl r4,30,r4: IntAlu : D=0x
96974000: drivesys.cpu: Tick
96974000: drivesys.cpu T0 : @sys_machine_check+153: blbc r4,0x8f21 : IntAlu :
96974500: drivesys.cpu: Tick
96974500: drivesys.cpu T0 : @sys_mchk_collect_iprs+1: mb : MemRead :
96975000: drivesys.cpu: Tick
96975000: drivesys.cpu T0 : @sys_mchk_collect_iprs+5: hw_mfpr IPR(0x11a),r1 : IprAccess : D=0x3800
96975500: drivesys.cpu: Tick
96975500: drivesys.cpu T0 : @sys_mchk_collect_iprs+9: hw_mfpr IPR(0x212),r8 : IprAccess : D=0x0003
96976000: drivesys.cpu: Tick
96976000: drivesys.cpu T0 : @sys_mchk_collect_iprs+13: hw_mtpr r31,IPR(0x210) : IprAccess : D=0x
96976500: drivesys.cpu: Tick
96976500: drivesys.cpu T0 : @sys_mchk_collect_iprs+17: hw_mfpr IPR(0x140),r31 : IprAccess : D=0x005e
96977000: drivesys.cpu: Tick
96977000: drivesys.cpu T0 : @sys_mchk_collect_iprs+21: hw_mfpr IPR(0x140),r31 : IprAccess : D=0x005e
96977500: drivesys.cpu: Tick
96977500:
Re: [m5-users] Potential typo in configs/example/fs.py and assertion failure when doing StandardSurge
I committed a change recently that replaced connectMemPorts with a small group of functions, the main one being connectAllPorts. The fact that fs.py doesn't refer to it makes me think you're using a copy of fs.py and not the original script.

My changeset is here: http://repo.m5sim.org/m5/rev/189b9b258779
And the file is here: http://repo.m5sim.org/m5/file/189b9b258779/configs/example/fs.py

I verified that the regressions worked and tried a few other things, but it's quite possible I missed something. These scripts do a lot, not all of which I know how to use. Could you try using the updated fs.py and see if that works? And if not, could you roll back to before my change and see if the original fs.py used to work? Basically I'd like to see if fs.py has been broken, if I broke it recently, or if your local copy is just out of date.

Gabe
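
For a local script that has to run against trees from both before and after this change, one option is to pick the method at run time. The following is a minimal sketch, not code from the M5 repository: the helper name connect_cpu_ports is hypothetical, and it assumes both methods accept the memory bus as their only required argument, as in the fs.py line quoted in this thread.

```python
def connect_cpu_ports(cpu, bus):
    """Connect cpu to bus with whichever method this tree provides.

    Assumption: connectAllPorts (newer trees) and connectMemPorts
    (older trees) can both be called with the bus as the only argument.
    Returns the name of the method that was used.
    """
    for name in ("connectAllPorts", "connectMemPorts"):
        method = getattr(cpu, name, None)
        if callable(method):
            method(bus)
            return name
    raise AttributeError(
        "CPU object has neither connectAllPorts nor connectMemPorts")
```

In fs.py this would stand in for the failing line, for example connect_cpu_ports(drive_sys.cpu, drive_sys.membus). Updating to the current fs.py, as suggested above, remains the simpler fix.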
Re: [m5-users] Cannot resume checkpoint
Hi Sheng,

I've dug back through some of my simulations, and I haven't been able to find a case where I used 4GB of simulated memory, so I don't know if I have a baseline to show that the checkpoint restore works with that much memory. On the other hand, I have simulated with 512MB and 1GB of simulated memory, and it has worked fine. For full-system simulations, we often mount a swap disk in the simulated system in order to avoid the small virtual memory constraints imposed by the operating system. I'd have to defer to others on the list for knowledge about whether that would work with SE mode.

I can attempt to address your other questions as well:

1) The way that you described the O3 parameters is how I have set them in the past, so that should work.

2) I've seen this problem before... It has had to do with the way that certain SimObjects are instantiated as children of other SimObjects at the beginning of the simulation, and with checkpoint restore, this isn't the cleanest process. When I ran into this problem, I was working on getting x86 timing mode working with Ruby, and Brad Beckmann was able to help me debug. He might be able to suggest first steps for figuring out what's wrong here.

Hope this helps,
Joel

On Wed, Feb 9, 2011 at 3:14 PM, Sheng Li sheng@gmail.com wrote:

And two other questions:

1. What should I do to change the O3 parameters such as issueWidth, commitWidth, etc.? I added a few lines in se.py as below. It runs fine if I just run the benchmarks, but if I resume a checkpoint (created without the -d option), it complains that the CPU class has no such parameters. I think these parameters can only be set after M5 performs the CPU mode switch, so how can I set these parameters so that M5 will use them after switching CPU mode?

    if options.detailed:
        CPUClass.commitWidth = 4
        CPUClass.decodeWidth = 4
        CPUClass.dispatchWidth = 4
        CPUClass.fetchWidth = 4
        CPUClass.issueWidth = 4
        CPUClass.commitWidth = 4
        CPUClass.renameWidth = 4
        CPUClass.squashWidth = 4
        CPUClass.wbWidth = 4
        CPUClass.numROBEntries = 128
        CPUClass.numIQEntries = 36
        CPUClass.LQEntries = 48

2. When I resume a checkpoint with -d --caches options, I got RuntimeError: Attempt to instantiate orphan node. I am trying to figure out what the orphan node is. What should I do to find the orphan node? I tried print self.name in File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 822, in getCCObject, but got nothing.

command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench bzip2 --checkpoint-restore=0 --simpoint -d --caches --l2cache 2200 m5out/cpt.bzip2.2200
Global frequency set at 1 ticks per second
Traceback (most recent call last):
  File string, line 1, in ?
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/main.py, line 359, in main
    exec filecode in scope
  File configs/example/se.py, line 179, in ?
    Simulation.run(options, root, system, FutureClass)
  File /afs/crc.nd.edu/user/s/sli2/m5-work-stable/configs/common/Simulation.py, line 236, in run
    m5.instantiate(checkpoint_dir)
  File /afs/crc.nd.edu/user/s/sli2/m5-work-stable/src/python/m5/simulate.py, line 77, in instantiate
    for obj in root.descendants(): obj.createCCObject()
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 841, in createCCObject
    def createCCObject(self):
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 796, in getCCParams
    value = value.getValue()
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 845, in getValue
    def getValue(self):
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 826, in getCCObject
    self._ccObject = -1
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 796, in getCCParams
    value = value.getValue()
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/params.py, line 183, in getValue
    return [ v.getValue() for v in self ]
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 845, in getValue
    def getValue(self):
  File /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 822, in getCCObject
    #print self.name
RuntimeError: Attempt to instantiate orphan node

Thanks a lot!
-Sheng

On Wed, Feb 9, 2011 at 4:03 PM, Sheng Li sheng@gmail.com wrote:

Thanks Joel! Yes, I did. The checkpoint created with 4096MB has problem as lots of information is missing. Is it possible that checkpoint does not support larger memory (i.e 4096MB) in M5?

Thanks
-Sheng

On Wed, Feb 9, 2011 at 3:31 PM, Joel Hestness hestn...@cs.utexas.edu wrote:

Hi Sheng,

Did you collect the checkpoints from a simulated system with 512MB of memory? The checkpoints encode the current state of memory in the