BTW, thanks for the detailed example... I've been traveling, but I'll see if
I can reproduce this when I get home.

Steve

On Thu, Feb 17, 2011 at 11:21 AM, Richard Strong <[email protected]> wrote:

> Here is the process I went through on a fresh checkout of m5 this morning.
>
> (1) hg clone http://repo.m5sim.org/m5
>
> (2) cd m5
>
> (3) scons build/ALPHA_SE/m5.opt
>
> (4) build/ALPHA_SE/m5.opt  configs/example/se.py  --take-checkpoint=1
> --at-instruction
>
> (5) build/ALPHA_SE/m5.opt  configs/example/se.py  --checkpoint-restore=1
> --at-instruction  -d --caches --l2cache
> M5 Simulator System
>
> Copyright (c) 2001-2008
> The Regents of The University of Michigan
> All Rights Reserved
>
>
> M5 compiled Feb 17 2011 09:41:58
> M5 revision 96bde0910197+ 8031+ default tip
> M5 started Feb 17 2011 09:54:32
> M5 executing on rstrong-desktop
> command line: build/ALPHA_SE/m5.opt configs/example/se.py
> --checkpoint-restore=1 --at-instruction -d --caches --l2cache
>
> Global frequency set at 1000000000000 ticks per second
> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
> Switch at curTick count:10000
> info: Entering event queue @ 1000.  Starting simulation...
> panic: Tried to access unmapped address 0x12008b488.
>  @ cycle 2500
> [invoke:build/ALPHA_SE/arch/alpha/faults.cc, line 208]
> Memory Usage: 586300 KBytes
> For more information see: http://www.m5sim.org/panic/5932f339
> Program aborted at cycle 2500
> Aborted
>
> The problem seen in the output of (5) above is caused by the workload being
> adopted by switch_cpus, rather than system.cpu, as its parent. My original
> fix was to modify simulate.py to adopt orphans in sorted order, but this
> appears to create orphans for fuPool, as shown in the snippet of config.ini
> below. This makes me think that something is broken in the design, since
> whether certain objects become orphans, and whether checkpoint files work,
> depends on the order in which objects come up. Is there any way to
> explicitly set the parent/child relationship to avoid this nondeterminism?
>
> config.ini selected output:
> [system.switch_cpus.fuPool]
> type=FUPool
> FUList=(orphan) (orphan) (orphan) (orphan) (orphan) (orphan) (orphan)
> (orphan) (orphan)
>
> On Thu, Feb 17, 2011 at 5:45 AM, Steve Reinhardt <[email protected]> wrote:
>
>> Hi Rick,
>>
>> I'm a little confused by your statement "there is no recursion to add the
>> children of params".  Being a param value and being a child are separate
>> things, because an object A can be a param of many other objects but can
>> only be the child of one other object.  The only relationship between the
>> two is that if A is set as a param value for a param of B and A does not
>> have a parent, then A will also implicitly be set as a child of B.  (See
>> towards the end of SimObject.__setattr__().)
>>
>> So every SimObject param value *should* be the child of *some* SimObject,
>> so iterating over param values shouldn't be necessary.  The whole point of
>> adoptOrphanParams() is to make sure this is true; it's the one place we
>> iterate over all the param values, just to make sure that they all have
>> parents (and to set them if they don't).
>>
>> Also, the adoptOrphanParams() method traverses the whole tree (see
>> simulate.py) using the descendants() call which is a pre-order traversal, so
>> any new children that are added at a particular node should be traversed
>> automatically.
>>
>> Your configuration should not be affected by whether you're restoring from
>> a checkpoint or not... the config gets built first, then if there's a
>> checkpoint it gets restored.
>>
>> I rewrote all this code last summer to clean it up, so I'm very interested
>> in figuring out where the bugs are.
>>
>> Steve
>>
>>
>> On Wed, Feb 16, 2011 at 9:48 PM, Richard Strong <[email protected]> wrote:
>>
>>> I took a close look at this problem because the same thing happens to me.
>>> It only occurs when I use the O3CPU model when resuming from a checkpoint.
>>> What I find is that config.ini has orphan for the FUList parameter of the
>>> O3CPU model. Further, none of the function units are adopted by fuPool. I
>>> think the problem lies in SimObject.py::add_child(self, name, child) and
>>> SimObject.py::adoptOrphanParams(self). I think that there is no recursion
>>> to add the children of params. I tried a simple change at the end of
>>> add_child that calls adoptOrphanParams() on the child (change shown
>>> below). This allows the setup code to get further, but now I die with:
>>>
>>> "AttributeError: 'AnyProxy' object has no attribute 'getValue'". I was
>>> wondering if someone knows what is going wrong? Did a recent change forget
>>> to go down enough recursive levels when adopting children nodes?
>>>
>>> Best,
>>> -Rick
>>>
>>> def add_child(self, name, child):
>>>         print "\t in add_child name=%s child=%s"%(name, child)
>>>         child = coerceSimObjectOrVector(child)
>>>         if child.get_parent():
>>>             raise RuntimeError, \
>>>                   "add_child('%s'): child '%s' already has parent '%s'" % \
>>>                   (name, child._name, child._parent)
>>>         if self._children.has_key(name):
>>>             # This code path had an undiscovered bug that would make it fail
>>>             # at runtime. It had been here for a long time and was only
>>>             # exposed by a buggy script. Changes here will probably not be
>>>             # exercised without specialized testing.
>>>             self.clear_child(name)
>>>         child.set_parent(self, name)
>>>         self._children[name] = child
>>>         if isSimObjectVector(child):
>>>             for obj in child:
>>>                 obj.adoptOrphanParams()
>>>         elif isSimObjectOrVector(child):
>>>             child.adoptOrphanParams()
>>>
>>>>
>>>>
>>>> On Fri, Feb 11, 2011 at 11:05 PM, Joel Hestness <[email protected]> wrote:
>>>>
>>>>> Hi Sheng,
>>>>>   I've dug back through some of my simulations, and I haven't been able
>>>>> to find a case where I used 4GB of simulated memory, so I don't know if I
>>>>> have a baseline to show that the checkpoint restore works with that much
>>>>> memory.  On the other hand, I have simulated with 512MB and 1GB of
>>>>> simulated memory, and it has worked fine.  For full-system simulations,
>>>>> we often mount a swap disk in the simulated system in order to avoid the
>>>>> small virtual memory constraints imposed by the operating system.  I'd
>>>>> have to defer to others on the list for knowledge about whether that
>>>>> would work with SE mode.
>>>>>   I can attempt to address your other questions as well:
>>>>>    1) The way that you described the O3 parameters is how I have set
>>>>> them in the past, so that should work.
>>>>>    2) I've seen this problem before... It has had to do with the way
>>>>> that certain SimObjects are instantiated as children of other SimObjects
>>>>> at the beginning of the simulation, and with checkpoint restore, this
>>>>> isn't the cleanest process.  When I ran into this problem, I was working
>>>>> on getting x86 timing mode working with Ruby, and Brad Beckmann was able
>>>>> to help me debug.  He might be able to suggest first steps for figuring
>>>>> out what's wrong here.
>>>>>   Hope this helps,
>>>>>   Joel
>>>>>
>>>>>
>>>>> On Wed, Feb 9, 2011 at 3:14 PM, Sheng Li <[email protected]> wrote:
>>>>>
>>>>>> And two other questions:
>>>>>>
>>>>>> 1. What should I do to change the O3 parameters such as issueWidth,
>>>>>> commitWidth, etc.? I added a few lines in se.py as below. It runs fine
>>>>>> if I just run the benchmarks, but if I resume a checkpoint (created
>>>>>> without the -d option), it complains that the CPU class has no such
>>>>>> parameters. I think these parameters can only be set after M5 performs
>>>>>> the CPU mode switch, so how can I set them so that M5 will use them
>>>>>> after switching CPU mode?
>>>>>>
>>>>>>  if options.detailed:
>>>>>>     CPUClass.commitWidth    = 4
>>>>>>     CPUClass.decodeWidth    = 4
>>>>>>     CPUClass.dispatchWidth  = 4
>>>>>>     CPUClass.fetchWidth     = 4
>>>>>>     CPUClass.issueWidth     = 4
>>>>>>     CPUClass.commitWidth    = 4
>>>>>>     CPUClass.renameWidth    = 4
>>>>>>     CPUClass.squashWidth    = 4
>>>>>>     CPUClass.wbWidth        = 4
>>>>>>     CPUClass.numROBEntries  = 128
>>>>>>     CPUClass.numIQEntries   = 36
>>>>>>     CPUClass.LQEntries      = 48
>>>>>>
>>>>>> 2. When I resume a checkpoint with the -d --caches options, I get
>>>>>> "RuntimeError: Attempt to instantiate orphan node". I am trying to
>>>>>> figure out what the orphan node is. What should I do to find it? I
>>>>>> tried "print self.name" in File
>>>>>> "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
>>>>>> line 822, in getCCObject, but got nothing.
>>>>>>
>>>>>>
>>>>>> command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench
>>>>>> bzip2 --checkpoint-restore=0 --simpoint -d --caches --l2cache
>>>>>> 2200
>>>>>> m5out/cpt.bzip2.2200
>>>>>>
>>>>>> Global frequency set at 1000000000000 ticks per second
>>>>>> Traceback (most recent call last):
>>>>>>   File "<string>", line 1, in ?
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/main.py", line 359, in main
>>>>>>     exec filecode in scope
>>>>>>   File "configs/example/se.py", line 179, in ?
>>>>>>     Simulation.run(options, root, system, FutureClass)
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-work-stable/configs/common/Simulation.py", line 236, in run
>>>>>>     m5.instantiate(checkpoint_dir)
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-work-stable/src/python/m5/simulate.py", line 77, in instantiate
>>>>>>     for obj in root.descendants(): obj.createCCObject()
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 841, in createCCObject
>>>>>>     def createCCObject(self):
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 796, in getCCParams
>>>>>>     value = value.getValue()
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 845, in getValue
>>>>>>     def getValue(self):
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 826, in getCCObject
>>>>>>     self._ccObject = -1
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 796, in getCCParams
>>>>>>     value = value.getValue()
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/params.py", line 183, in getValue
>>>>>>     return [ v.getValue() for v in self ]
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 845, in getValue
>>>>>>     def getValue(self):
>>>>>>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py", line 822, in getCCObject
>>>>>>     #print self.name
>>>>>> RuntimeError: Attempt to instantiate orphan node
>>>>>>
>>>>>> Thanks a lot!
>>>>>> -Sheng
>>>>>>
>>>>>
>>
>
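
For anyone reading this thread later: the adoption rule Steve describes
(an object can be a param value of many objects but a child of only one, and
the first object that sees it without a parent adopts it) can be sketched
with a small toy model. This is only an illustration under stated
assumptions, not the code in SimObject.py. The Node class, set_param(),
path() and build() are hypothetical names made up for the sketch; add_child,
adopt_orphan_params and descendants merely echo the method names mentioned
in the thread, and the sketch is written for a current Python 3 interpreter.

```python
class Node:
    """Toy stand-in for a SimObject (hypothetical, not the M5 class)."""

    def __init__(self, label):
        self.label = label
        self.parent = None
        self.name = None
        self.children = {}  # child name -> Node
        self.params = {}    # param name -> Node

    def add_child(self, name, child):
        # A node can have only one parent.
        if child.parent is not None:
            raise RuntimeError("'%s' already has parent '%s'"
                               % (child.label, child.parent.label))
        child.parent = self
        child.name = name
        self.children[name] = child

    def set_param(self, name, value):
        # A node may be a param value of many objects, but only the first
        # object that sees it without a parent adopts it as a child.
        self.params[name] = value
        if value.parent is None:
            self.add_child(name, value)

    def adopt_orphan_params(self):
        # Safety net: give any still-parentless param value a parent.
        for name, value in self.params.items():
            if value.parent is None:
                self.add_child(name, value)

    def descendants(self):
        # Pre-order walk. Children are read after this node is yielded, so
        # children adopted while visiting this node are still visited.
        yield self
        for child in list(self.children.values()):
            yield from child.descendants()

    def path(self):
        if self.parent is None:
            return self.label
        return self.parent.path() + "." + self.name


def build(order, explicit_parent=None):
    """Share one workload between two CPUs; return where it ends up."""
    root = Node("root")
    owners = {"cpu": Node("cpu"), "switch_cpus": Node("switch_cpus")}
    for name, node in owners.items():
        root.add_child(name, node)
    workload = Node("workload")
    if explicit_parent is not None:
        # Pin the parent before any param assignment can adopt the node.
        owners[explicit_parent].add_child("workload", workload)
    for name in order:
        owners[name].set_param("workload", workload)
    for node in root.descendants():
        node.adopt_orphan_params()
    return workload.path()


print(build(["cpu", "switch_cpus"]))
print(build(["switch_cpus", "cpu"]))
print(build(["switch_cpus", "cpu"], explicit_parent="cpu"))
```

In this toy the first two calls print root.cpu.workload and
root.switch_cpus.workload, so the parent depends purely on assignment
order, which is the kind of order dependence Rick reports. The third call
prints root.cpu.workload again because the child was attached explicitly
before any param was set. Whether the same ordering trick resolves the
checkpoint-restore failure in M5 itself is not something this sketch can
show.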
_______________________________________________
m5-users mailing list
[email protected]
http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
