I took a close look at this problem because the same thing happens to me. It
only occurs when I use the O3CPU model while resuming from a checkpoint. What
I find is that config.ini lists the FUList parameter of the O3CPU model as an
orphan. Further, none of the functional units are adopted by fuPool. I think
the problem lies in SimObject.py::add_child(self, name, child) and
SimObject.py::adoptOrphanParams(self): there appears to be no recursion to
adopt the children of params. I tried a simple change at the end of add_child
that calls adoptOrphanParams() on the child (change shown below). This allows
the setup code to get further, but now it dies with:
"AttributeError: 'AnyProxy' object has no attribute 'getValue'". Does anyone
know what is going wrong? Did a recent change forget to go down enough
recursive levels when adopting child nodes?
Best,
-Rick
def add_child(self, name, child):
    print "\t in add_child name=%s child=%s"%(name, child)
    child = coerceSimObjectOrVector(child)
    if child.get_parent():
        raise RuntimeError, \
              "add_child('%s'): child '%s' already has parent '%s'" % \
              (name, child._name, child._parent)
    if self._children.has_key(name):
        # This code path had an undiscovered bug that would make it fail
        # at runtime. It had been here for a long time and was only
        # exposed by a buggy script. Changes here will probably not be
        # exercised without specialized testing.
        self.clear_child(name)
    child.set_parent(self, name)
    self._children[name] = child
    if isSimObjectVector(child):
        for obj in child:
            obj.adoptOrphanParams()
    elif isSimObjectOrVector(child):
        child.adoptOrphanParams()
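
A toy sketch of the recursion described above, for anyone who wants to poke at the idea outside of M5. This is not M5 code: the Node class and the method names add_child and adopt_orphan_params are made-up stand-ins that only mirror SimObject.add_child and adoptOrphanParams, the child naming scheme (FUList0, FUList1) is an assumption, and it is written for Python 3. It shows how adopting a child can also adopt nodes that are reachable only through that child's params.

```python
class Node(object):
    """Hypothetical stand-in for a SimObject: a parent, children, and params."""

    def __init__(self, name, **params):
        self.name = name
        self.parent = None
        self.children = {}
        # Param values may be a Node, a list of Nodes, or plain data.
        self.params = params

    def add_child(self, name, child):
        if child.parent is not None:
            raise RuntimeError("add_child('%s'): child '%s' already has parent '%s'"
                               % (name, child.name, child.parent.name))
        child.parent = self
        self.children[name] = child
        # Recurse: nodes referenced only by the child's params get adopted too.
        child.adopt_orphan_params()

    def adopt_orphan_params(self):
        for key, value in list(self.params.items()):
            is_list = isinstance(value, list)
            for i, item in enumerate(value if is_list else [value]):
                if isinstance(item, Node) and item.parent is None:
                    # Assumed naming scheme: list entries get an index suffix.
                    self.add_child("%s%d" % (key, i) if is_list else key, item)


# A small tree shaped like cpu -> fuPool -> FUList, built bottom-up as orphans.
int_alu = Node("IntALU")
fp_alu = Node("FPALU")
fu_pool = Node("fuPool", FUList=[int_alu, fp_alu])
cpu = Node("cpu", fuPool=fu_pool, issueWidth=4)
root = Node("root")
root.add_child("cpu", cpu)
```

After the single root.add_child call, every node in the toy tree has a parent, which is the property the real instantiate step appears to require.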
>
>
> On Fri, Feb 11, 2011 at 11:05 PM, Joel Hestness <[email protected]> wrote:
>
>> Hi Sheng,
>> I've dug back through some of my simulations, and I haven't been able to
>> find a case where I used 4GB of simulated memory, so I don't know if I have
>> a baseline to show that the checkpoint restore works with that much memory.
>> On the other hand, I have simulated with 512MB and 1GB of simulated memory,
>> and it has worked fine. For full-system simulations, we often mount a swap
>> disk in the simulated system in order to avoid the small virtual memory
>> constraints imposed by the operating system. I'd have to defer to others on
>> the list for knowledge about whether that would work with SE mode.
>> I can attempt to address your other questions as well:
>> 1) The way that you described the O3 parameters is how I have set them
>> in the past, so that should work.
>> 2) I've seen this problem before... It has had to do with the way that
>> certain SimObjects are instantiated as children of other SimObjects at the
>> beginning of the simulation, and with checkpoint restore, this isn't the
>> cleanest process. When I ran into this problem, I was working on getting
>> x86 timing mode working with Ruby, and Brad Beckmann was able to help me
>> debug. He might be able to suggest first steps for figuring out what's
>> wrong here.
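
On question 1, a minimal sketch of one way to avoid the "no such parameters" complaint, assuming (as Sheng suspects) that the restore path first uses a simple CPU class and only later switches to the O3 class. FakeSimpleCPU, FakeO3CPU and apply_params are hypothetical stand-ins written for Python 3, not M5 classes or functions, and whether hasattr behaves this way on real M5 SimObject classes is an assumption that has not been checked.

```python
# Values taken from the se.py snippet quoted later in this thread.
O3_PARAMS = {
    "fetchWidth": 4, "decodeWidth": 4, "renameWidth": 4, "dispatchWidth": 4,
    "issueWidth": 4, "wbWidth": 4, "commitWidth": 4, "squashWidth": 4,
    "numROBEntries": 128, "numIQEntries": 36, "LQEntries": 48,
}


class FakeSimpleCPU(object):
    """Hypothetical stand-in for the CPU class used while restoring."""


class FakeO3CPU(object):
    """Hypothetical stand-in for the detailed CPU class; defaults are made up."""
    fetchWidth = decodeWidth = renameWidth = dispatchWidth = 8
    issueWidth = wbWidth = commitWidth = squashWidth = 8
    numROBEntries = 192
    numIQEntries = 64
    LQEntries = 32


def apply_params(cpu_class, params):
    """Set only the params cpu_class already declares; return the skipped names."""
    skipped = []
    for name in sorted(params):
        if hasattr(cpu_class, name):
            setattr(cpu_class, name, params[name])
        else:
            skipped.append(name)
    return skipped


skipped_simple = apply_params(FakeSimpleCPU, O3_PARAMS)  # nothing applies
skipped_o3 = apply_params(FakeO3CPU, O3_PARAMS)          # everything applies
```

The point of the guard is that the same configuration code can be run against both the class used for restoring and the class switched to afterwards, without assuming which one is the detailed model.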
>> Hope this helps,
>> Joel
>>
>>
>> On Wed, Feb 9, 2011 at 3:14 PM, Sheng Li <[email protected]> wrote:
>>
>>> And two other questions:
>>>
>>> 1. What should I do to change the O3 parameters such as issueWidth,
>>> commitWidth, etc.? I added a few lines to se.py, as below. It runs fine if I
>>> just run the benchmarks, but if I resume from a checkpoint (created without
>>> the -d option), it complains that the CPU class has no such parameters. I
>>> think these parameters can only be set after M5 performs the CPU mode
>>> switch. How can I set them so that M5 uses them after switching CPU mode?
>>>
>>> if options.detailed:
>>>     CPUClass.commitWidth = 4
>>>     CPUClass.decodeWidth = 4
>>>     CPUClass.dispatchWidth = 4
>>>     CPUClass.fetchWidth = 4
>>>     CPUClass.issueWidth = 4
>>>     CPUClass.renameWidth = 4
>>>     CPUClass.squashWidth = 4
>>>     CPUClass.wbWidth = 4
>>>     CPUClass.numROBEntries = 128
>>>     CPUClass.numIQEntries = 36
>>>     CPUClass.LQEntries = 48
>>>
>>> 2. When I resume from a checkpoint with the -d --caches options, I get
>>> "RuntimeError: Attempt to instantiate orphan node". I am trying to figure
>>> out which node is the orphan. How can I find it? I tried "print self.name"
>>> at line 822 of
>>> /afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py (in
>>> getCCObject), but got nothing.
>>>
>>>
>>> command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench bzip2
>>> --checkpoint-restore=0 --simpoint -d --caches --l2cache
>>> 2200
>>> m5out/cpt.bzip2.2200
>>>
>>> Global frequency set at 1000000000000 ticks per second
>>> Traceback (most recent call last):
>>>   File "<string>", line 1, in ?
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/main.py", line 359, in main
>>>     exec filecode in scope
>>>   File "configs/example/se.py", line 179, in ?
>>>     Simulation.run(options, root, system, FutureClass)
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-work-stable/configs/common/Simulation.py", line 236, in run
>>>     m5.instantiate(checkpoint_dir)
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-work-stable/src/python/m5/simulate.py", line 77, in instantiate
>>>     for obj in root.descendants(): obj.createCCObject()
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 841, in createCCObject
>>>     def createCCObject(self):
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 796, in getCCParams
>>>     value = value.getValue()
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 845, in getValue
>>>     def getValue(self):
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 826, in getCCObject
>>>     self._ccObject = -1
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 796, in getCCParams
>>>     value = value.getValue()
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/params.py", line 183, in getValue
>>>     return [ v.getValue() for v in self ]
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 845, in getValue
>>>     def getValue(self):
>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 822, in getCCObject
>>>     #print self.name
>>> RuntimeError: Attempt to instantiate orphan node
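
On finding the orphan: one possible approach is to walk the object tree through the params and report every node that has no parent, together with the path that reaches it. The sketch below does this on a toy tree. Node and find_orphans are hypothetical names, not part of M5, the code is Python 3, and it assumes the graph reachable through params has no cycles.

```python
class Node(object):
    """Hypothetical stand-in for a SimObject with a parent link and params."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.params = {}


def find_orphans(root):
    """Return dotted param paths of parentless nodes reachable from root."""
    found = []

    def visit(node, path):
        for key in sorted(node.params):
            value = node.params[key]
            is_list = isinstance(value, list)
            for i, item in enumerate(value if is_list else [value]):
                if not isinstance(item, Node):
                    continue  # plain data such as ints or strings
                item_path = "%s.%s" % (path, key)
                if is_list:
                    item_path += "[%d]" % i
                if item.parent is None:
                    found.append(item_path)
                visit(item, item_path)

    visit(root, root.name)
    return found


# Toy tree: the functional unit in FUList was never given a parent.
root = Node("root")
cpu = Node("cpu", parent=root)
fu_pool = Node("fuPool", parent=cpu)
int_alu = Node("IntALU")  # orphan
root.params["cpu"] = cpu
cpu.params["fuPool"] = fu_pool
cpu.params["issueWidth"] = 4
fu_pool.params["FUList"] = [int_alu]

orphans = find_orphans(root)
```

Printing the path rather than just the node name helps here because an orphan has no parent chain from which a full name could be built.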
>>>
>>> Thanks a lot!
>>> -Sheng
>>>
>>>
>>>
>>> On Wed, Feb 9, 2011 at 4:03 PM, Sheng Li <[email protected]> wrote:
>>>
>>>> Thanks Joel!
>>>>
>>>> Yes, I did. The checkpoint created with 4096MB has problems, as a lot of
>>>> information is missing. Is it possible that checkpoints do not support
>>>> larger memory sizes (e.g. 4096MB) in M5?
>>>>
>>>> Thanks
>>>> -Sheng
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Feb 9, 2011 at 3:31 PM, Joel Hestness <[email protected]> wrote:
>>>>
>>>>> Hi Sheng,
>>>>> Did you collect the checkpoints from a simulated system with 512MB of
>>>>> memory? The checkpoints encode the current state of memory in the
>>>>> simulated system, including the capacity, so you'll need to make sure
>>>>> that both runs (the one that collects the checkpoint and the one that
>>>>> restores from it) use the same amount of simulated memory.
>>>>> More generally, an M5 checkpoint is specific to the ISA/architecture,
>>>>> number of cores, and the capacity of memory in the simulated system that
>>>>> you collect the checkpoint from.
>>>>> Hope this helps,
>>>>> Joel
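
One way to check the "same configuration in both runs" requirement is to compare selected settings from the two runs' INI-style config dumps. The sketch below is Python 3 and uses only the standard configparser module. The section and key names in the sample text (system.physmem, range, mem_mode) are assumptions chosen for illustration and may not match what M5 actually writes; diff_keys is a hypothetical helper.

```python
import configparser


def read_ini(text):
    """Parse INI text without interpolation and with case-sensitive keys."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str
    parser.read_string(text)
    return parser


def diff_keys(ini_a, ini_b, keys):
    """Return {(section, option): (value_a, value_b)} for settings that differ."""
    a, b = read_ini(ini_a), read_ini(ini_b)
    mismatches = {}
    for section, option in keys:
        value_a = a.get(section, option, fallback=None)
        value_b = b.get(section, option, fallback=None)
        if value_a != value_b:
            mismatches[(section, option)] = (value_a, value_b)
    return mismatches


# Made-up sample dumps: 512MB when taking the checkpoint, 4096MB when restoring.
TAKE_RUN = "[system]\nmem_mode=atomic\n[system.physmem]\nrange=0:536870911\n"
RESTORE_RUN = "[system]\nmem_mode=atomic\n[system.physmem]\nrange=0:4294967295\n"

mismatches = diff_keys(TAKE_RUN, RESTORE_RUN,
                       [("system", "mem_mode"), ("system.physmem", "range")])
```

An empty result would suggest the compared settings agree; it says nothing about settings that were not listed in keys.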
>>>>>
>>>>>
>>>>> On Wed, Feb 9, 2011 at 12:41 PM, Sheng Li <[email protected]> wrote:
>>>>>
>>>>>> After spending several hours guessing what was wrong, here are my
>>>>>> findings:
>>>>>>
>>>>>> It seems that if I set PhysicalMemory to 512MB, checkpointing works.
>>>>>> However, if I set it to 4096MB (I did this because SPECCPU2006
>>>>>> requires at least 2GB of free memory), checkpointing does not work. The
>>>>>> place I changed this is in configs/example/se.py:
>>>>>>
>>>>>> system = System(cpu = [CPUClass(cpu_id=i) for i in xrange(np)],
>>>>>>                 physmem = PhysicalMemory(range=AddrRange("4096MB")),
>>>>>>                 membus = Bus(), mem_mode = test_mem_mode)
>>>>>>
>>>>>> Could anyone give some suggestions?
>>>>>>
>>>>>> Thanks!
>>>>>> -Sheng
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Wed, Feb 9, 2011 at 12:05 AM, Sheng Li <[email protected]> wrote:
>>>>>>
>>>>>>> Hi Guys,
>>>>>>>
>>>>>>> I tried to use checkpoints in M5 but could not get them to work. I used
>>>>>>> ALPHA_SE.
>>>>>>>
>>>>>>> The commands I use to create/resume checkpoints and the M5 outputs are:
>>>>>>>
>>>>>>> Creating checkpoint:
>>>>>>> ______________________
>>>>>>> [sli2@newcell ~/m5-work-stable]$ ./build/ALPHA_SE/m5.opt
>>>>>>> configs/example/se.py --bench bzip2 --take-checkpoint=2200
>>>>>>> --at-instruction
>>>>>>> ...
>>>>>>> command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench
>>>>>>> bzip2 --take-checkpoint=2200 --at-instruction
>>>>>>> 2200000000
>>>>>>> Global frequency set at 1000000000000 ticks per second
>>>>>>> 0: system.remote_gdb.listener: listening for remote gdb #0 on port
>>>>>>> 7000
>>>>>>> Creating checkpoint at inst:2200
>>>>>>> info: Entering event queue @ 0. Starting simulation...
>>>>>>> info: Increasing stack size by one page.
>>>>>>> hack: be nice to actually delete the event here
>>>>>>> exit cause = a thread reached the max instruction count
>>>>>>> Writing checkpoint
>>>>>>> Checkpoint written.
>>>>>>> Exiting @ cycle 1111000 because a thread reached the max instruction
>>>>>>> count
>>>>>>>
>>>>>>> Resume checkpoint:
>>>>>>> _________________________
>>>>>>> command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench
>>>>>>> bzip2 --checkpoint-restore=2200 --at-instruction
>>>>>>> 2200000000
>>>>>>> Global frequency set at 1000000000000 ticks per second
>>>>>>> 0: system.remote_gdb.listener: listening for remote gdb #0 on port
>>>>>>> 7000
>>>>>>> warn: optional parameter system.cpu.workload:M5_pid not present
>>>>>>> For more information see: http://www.m5sim.org/warn/aa78cda1
>>>>>>> **** REAL SIMULATION ****
>>>>>>> info: Entering event queue @ 1111000. Starting simulation...
>>>>>>> hack: be nice to actually delete the event here
>>>>>>> Exiting @ cycle 1111500 because halt instruction encountered <--Here
>>>>>>> is the problem.
>>>>>>>
>>>>>>> Any help would be highly appreciated!
>>>>>>>
>>>>>>> Thanks
>>>>>>> -Sheng
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>> _______________________________________________
>>>>>> m5-users mailing list
>>>>>> [email protected]
>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Joel Hestness
>>>>> PhD Student, Computer Architecture
>>>>> Dept. of Computer Science, University of Texas - Austin
>>>>> http://www.cs.utexas.edu/~hestness
>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>
>