Hi Sheng,
  I've dug back through some of my simulations, and I haven't been able to
find a case where I used 4GB of simulated memory, so I don't know if I have
a baseline to show that the checkpoint restore works with that much memory.
 On the other hand, I have simulated with 512MB and 1GB of simulated memory,
and it has worked fine.  For full-system simulations, we often mount a swap
disk in the simulated system in order to avoid the small virtual memory
constraints imposed by the operating system.  I'd have to defer to others on
the list for knowledge about whether that would work with SE mode.
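  As a rough sketch of the point made earlier in this thread (a checkpoint is
tied to the ISA, the number of cores, and the memory capacity of the system it
was collected from), one could record those settings next to the checkpoint
and compare them before restoring. Everything below is hypothetical and not
part of M5: the dictionary keys and the config_mismatches helper are made up
for illustration.

```python
def config_mismatches(saved, current):
    """Return the sorted keys whose values differ between two config dicts."""
    keys = set(saved) | set(current)
    return sorted(k for k in keys if saved.get(k) != current.get(k))


# Hypothetical settings recorded when the checkpoint was taken ...
checkpoint_cfg = {"isa": "ALPHA", "num_cpus": 1, "mem_size": "512MB"}
# ... and the settings of the run that tries to restore from it.
restore_cfg = {"isa": "ALPHA", "num_cpus": 1, "mem_size": "4096MB"}

# Any key reported here would need to be made identical before restoring.
print(config_mismatches(checkpoint_cfg, restore_cfg))
```

An empty list would mean the two runs agree on every recorded setting.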
  I can attempt to address your other questions as well:
   1) The way that you described the O3 parameters is how I have set them in
the past, so that should work.
   2) I've seen this problem before. It had to do with the way that
certain SimObjects are instantiated as children of other SimObjects at the
beginning of the simulation; with checkpoint restore, this isn't the
cleanest process.  When I ran into this problem, I was working on getting
x86 timing mode working with Ruby, and Brad Beckmann was able to help me
debug.  He might be able to suggest first steps for figuring out what's
wrong here.
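  For question 1, the sketch below shows one possible way to guard the width
settings so that they are only applied to a class that actually declares them,
which is the situation behind the "no such parameters" complaint. It does not
import M5: SimpleCPUStandIn, O3CPUStandIn, and apply_overrides are
hypothetical stand-ins, and the default values on the stand-in class are
invented for illustration.

```python
# Stand-in classes so the sketch runs without M5 (hypothetical names).
class SimpleCPUStandIn(object):
    """Plays the role of a simple CPU class with no O3 width parameters."""
    pass


class O3CPUStandIn(object):
    """Plays the role of a detailed CPU class; defaults are invented."""
    issueWidth = 8
    commitWidth = 8
    numROBEntries = 192


# Values taken from the settings quoted in the question.
O3_OVERRIDES = {
    "issueWidth": 4,
    "commitWidth": 4,
    "numROBEntries": 128,
}


def apply_overrides(cpu_class, overrides):
    """Set each override only if cpu_class already declares that name.

    Returns the sorted list of parameter names that were skipped.
    """
    skipped = []
    for name, value in sorted(overrides.items()):
        if hasattr(cpu_class, name):
            setattr(cpu_class, name, value)
        else:
            skipped.append(name)
    return skipped


# The detailed stand-in accepts every override; the simple one accepts none.
print(apply_overrides(O3CPUStandIn, O3_OVERRIDES))
print(apply_overrides(SimpleCPUStandIn, O3_OVERRIDES))
```

In a real config script the same idea would be applied to whichever class the
script uses for the detailed CPU. Which variable holds that class on a
checkpoint restore is an assumption that is not verified here.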
  Hope this helps,
  Joel


On Wed, Feb 9, 2011 at 3:14 PM, Sheng Li <[email protected]> wrote:

> And two other questions:
>
> 1. What should I do to change the O3 parameters such as issueWidth,
> commitWidth, etc.? I added a few lines to se.py, as shown below. It runs fine
> if I just run the benchmarks, but if I resume a checkpoint (created without
> the -d option), it complains that the CPU class has no such parameters. I
> think these parameters can only be set after M5 performs the CPU mode switch.
> How can I set them so that M5 will use them after switching CPU mode?
>
> if options.detailed:
>     CPUClass.commitWidth    = 4
>     CPUClass.decodeWidth    = 4
>     CPUClass.dispatchWidth  = 4
>     CPUClass.fetchWidth     = 4
>     CPUClass.issueWidth     = 4
>     CPUClass.renameWidth    = 4
>     CPUClass.squashWidth    = 4
>     CPUClass.wbWidth        = 4
>     CPUClass.numROBEntries  = 128
>     CPUClass.numIQEntries   = 36
>     CPUClass.LQEntries      = 48
>
> 2. When I resume a checkpoint with the -d --caches options, I get
> "RuntimeError: Attempt to instantiate orphan node". I am trying to figure
> out what the orphan node is. What should I do to find it? I tried
> "print self.name" in getCCObject
> (/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py, line 822),
> but got nothing.
>
>
> command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench bzip2
> --checkpoint-restore=0 --simpoint -d --caches --l2cache
> 2200
> m5out/cpt.bzip2.2200
>
> Global frequency set at 1000000000000 ticks per second
> Traceback (most recent call last):
>   File "<string>", line 1, in ?
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/main.py", line
> 359, in main
>     exec filecode in scope
>   File "configs/example/se.py", line 179, in ?
>     Simulation.run(options, root, system, FutureClass)
>   File "/afs/
> crc.nd.edu/user/s/sli2/m5-work-stable/configs/common/Simulation.py", line
> 236, in run
>     m5.instantiate(checkpoint_dir)
>   File "/afs/
> crc.nd.edu/user/s/sli2/m5-work-stable/src/python/m5/simulate.py", line 77,
> in instantiate
>     for obj in root.descendants(): obj.createCCObject()
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
> line 841, in createCCObject
>     def createCCObject(self):
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
> line 796, in getCCParams
>     value = value.getValue()
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
> line 845, in getValue
>     def getValue(self):
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
> line 826, in getCCObject
>     self._ccObject = -1
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
> line 796, in getCCParams
>     value = value.getValue()
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/params.py",
> line 183, in getValue
>     return [ v.getValue() for v in self ]
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
> line 845, in getValue
>     def getValue(self):
>   File "/afs/crc.nd.edu/user/s/sli2/m5-stable/src/python/m5/SimObject.py",
> line 822, in getCCObject
>     #print self.name
> RuntimeError: Attempt to instantiate orphan node
>
> Thanks a lot!
> -Sheng
>
>
>
> On Wed, Feb 9, 2011 at 4:03 PM, Sheng Li <[email protected]> wrote:
>
>> Thanks Joel!
>>
>> Yes, I did. The checkpoint created with 4096MB has problems, as a lot of
>> information is missing. Is it possible that checkpointing in M5 does not
>> support larger memory (i.e., 4096MB)?
>>
>> Thanks
>> -Sheng
>>
>>
>>
>>
>> On Wed, Feb 9, 2011 at 3:31 PM, Joel Hestness <[email protected]> wrote:
>>
>>> Hi Sheng,
>>>   Did you collect the checkpoints from a simulated system with 512MB of
>>> memory?  The checkpoints encode the current state of memory in the simulated
>>> system, including the capacity, so you'll need to make sure that both runs
>>> (the one that collects the checkpoint and the one that restores from it)
>>> use the same amount of simulated memory.
>>>   More generally, an M5 checkpoint is specific to the ISA/architecture,
>>> number of cores, and the capacity of memory in the simulated system that you
>>> collect the checkpoint from.
>>>   Hope this helps,
>>>   Joel
>>>
>>>
>>> On Wed, Feb 9, 2011 at 12:41 PM, Sheng Li <[email protected]> wrote:
>>>
>>>> After spending several hours trying to guess what was wrong, here are my
>>>> findings:
>>>>
>>>> It seems that if I set PhysicalMemory to 512MB, checkpointing works.
>>>> However, if I set it to 4096MB (I did this because SPECCPU2006 requires at
>>>> least 2GB of free memory), checkpointing does not work. The place I changed
>>>> this is in configs/example/se.py
>>>>
>>>> system = System(cpu = [CPUClass(cpu_id=i) for i in xrange(np)],
>>>>                 physmem = PhysicalMemory(range=AddrRange("4096MB")),
>>>>                 membus = Bus(), mem_mode = test_mem_mode)
>>>>
>>>> Could anyone give some suggestions?
>>>>
>>>> Thanks!
>>>> -Sheng
>>>>
>>>>
>>>>
>>>> On Wed, Feb 9, 2011 at 12:05 AM, Sheng Li <[email protected]> wrote:
>>>>
>>>>> Hi Guys,
>>>>>
>>>>> I tried to use checkpoints in M5 but could not get them to work. I used
>>>>> ALPHA_SE.
>>>>>
>>>>> The commands I use to create/resume checkpoints and the M5 outputs are:
>>>>>
>>>>> Creating checkpoint:
>>>>> ______________________
>>>>> [sli2@newcell ~/m5-work-stable]$ ./build/ALPHA_SE/m5.opt
>>>>> configs/example/se.py --bench bzip2 --take-checkpoint=2200 
>>>>> --at-instruction
>>>>> ...
>>>>> command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench
>>>>> bzip2 --take-checkpoint=2200 --at-instruction
>>>>> 2200000000
>>>>> Global frequency set at 1000000000000 ticks per second
>>>>> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
>>>>> Creating checkpoint at inst:2200
>>>>> info: Entering event queue @ 0.  Starting simulation...
>>>>> info: Increasing stack size by one page.
>>>>> hack: be nice to actually delete the event here
>>>>> exit cause = a thread reached the max instruction count
>>>>> Writing checkpoint
>>>>> Checkpoint written.
>>>>> Exiting @ cycle 1111000 because a thread reached the max instruction
>>>>> count
>>>>>
>>>>> Resume checkpoint:
>>>>> _________________________
>>>>> command line: ./build/ALPHA_SE/m5.opt configs/example/se.py --bench
>>>>> bzip2 --checkpoint-restore=2200 --at-instruction
>>>>> 2200000000
>>>>> Global frequency set at 1000000000000 ticks per second
>>>>> 0: system.remote_gdb.listener: listening for remote gdb #0 on port 7000
>>>>> warn: optional parameter system.cpu.workload:M5_pid not present
>>>>> For more information see: http://www.m5sim.org/warn/aa78cda1
>>>>> **** REAL SIMULATION ****
>>>>> info: Entering event queue @ 1111000.  Starting simulation...
>>>>> hack: be nice to actually delete the event here
>>>>> Exiting @ cycle 1111500 because halt instruction encountered <--Here
>>>>> is the problem.
>>>>>
>>>>> Any help would be highly appreciated!
>>>>>
>>>>> Thanks
>>>>> -Sheng
>>>>>
>>>>>
>>>>>
>>>>
>>>> _______________________________________________
>>>> m5-users mailing list
>>>> [email protected]
>>>> http://m5sim.org/cgi-bin/mailman/listinfo/m5-users
>>>>
>>>
>>>
>>>
>>> --
>>>   Joel Hestness
>>>   PhD Student, Computer Architecture
>>>   Dept. of Computer Science, University of Texas - Austin
>>>   http://www.cs.utexas.edu/~hestness
>>>
>>
>>
>


