Hi,

I noticed that there are some issues with the current SPARC codebase that
create problems when checkpointing in FS mode.

When running the following:
build/SPARC/gem5.opt configs/example/fs.py --checkpoint-at-end

The resulting m5.cpt file is either missing parameters or has them written
incorrectly, which in turn causes fatal errors during the unserialization
phase of loading the checkpoint. Once I identified which parameters were
missing or incorrect, I was able to modify m5.cpt by hand and successfully
restore the checkpoint.

Here is the complete list of errors reported when restoring the checkpoint
with the following:
build/SPARC/gem5.opt configs/example/fs.py -r 1

fatal: Can't unserialize 'system.cpu.dtb:sfar' - missing in file
fatal: Can't unserialize 'system.cpu.isa:pstate' -  incorrect
format: (uint16_t)pstate=22
fatal: Can't unserialize 'system.cpu.isa:hpstate' - incorrect
format: (uint64_t)hpstate=2048
fatal: Can't unserialize 'system.cpu.itb:sfar' - missing in file
fatal: Can't unserialize 'system.disk0:currPwrState' - missing in file*
fatal: Can't unserialize 'system.disk0:prvEvalTick' - missing in file*
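
For anyone who wants to check a checkpoint for the "missing in file" cases
before restoring, here is a minimal sketch. It assumes m5.cpt is an INI-style
file with one section per SimObject path and key=value entries. The function
name missing_params and the sample text are hypothetical and only for
illustration; the expected list is taken from the fatal messages above. It
does not check the two "incorrect format" cases.

```python
import configparser

# Parameters named in the "missing in file" messages, as (section, key) pairs.
EXPECTED = [
    ("system.cpu.dtb", "sfar"),
    ("system.cpu.itb", "sfar"),
    ("system.disk0", "currPwrState"),
    ("system.disk0", "prvEvalTick"),
]


def missing_params(cpt_text, expected=EXPECTED):
    """Return the (section, key) pairs that are absent from the checkpoint text."""
    parser = configparser.ConfigParser(
        interpolation=None, strict=False, delimiters=("=",)
    )
    parser.optionxform = str  # keep key case, e.g. currPwrState
    parser.read_string(cpt_text)
    return [
        (section, key)
        for section, key in expected
        if not parser.has_option(section, key)
    ]


if __name__ == "__main__":
    # Hypothetical checkpoint fragment, not real gem5 output.
    sample = "[system.cpu.dtb]\nsfar=0\n\n[system.disk0]\n"
    for section, key in missing_params(sample):
        print(f"{section}:{key} missing")
```

To run it against a real checkpoint, read the file contents and pass them in,
for example missing_params(open("m5.cpt").read()).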

I have proposed changes that resolve these issues on my end, so that these
parameters are generated correctly in the m5.cpt file; most of them are
small fixes. However, with regard to system.disk0: since it inherits from
ClockedObject, the default unserialize() function looks for currPwrState and
prvEvalTick. system.disk0 is an MmDisk object, and MmDisk overrides
ClockedObject's serialize() function, so those two values are never written.
My current solution is to override unserialize() in MmDisk as well and simply
leave it empty, but I am not sure that is the right thing to do. I can send
in a code review with all of my proposed changes, but since this is my first
time contributing to a project of this scope, I apologize in advance if I
miss a few steps.
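
To make the disk0 problem concrete, here is a toy model of the
serialize()/unserialize() mismatch. It is not gem5 code: the class names and
the dict standing in for the checkpoint are hypothetical, and only the two
key names come from the errors above. It shows the failure, the empty
unserialize() workaround I described, and one alternative (chaining to the
base class) that I have not tried in gem5 itself.

```python
class ClockedBase:
    """Toy stand-in for a base class with paired serialize/unserialize."""

    def __init__(self):
        self.curr_pwr_state = 0
        self.prv_eval_tick = 0

    def serialize(self, cp):
        cp["currPwrState"] = self.curr_pwr_state
        cp["prvEvalTick"] = self.prv_eval_tick

    def unserialize(self, cp):
        for key in ("currPwrState", "prvEvalTick"):
            if key not in cp:
                raise KeyError(f"Can't unserialize '{key}' - missing in file")
        self.curr_pwr_state = cp["currPwrState"]
        self.prv_eval_tick = cp["prvEvalTick"]


class DiskWriteOnlyOverride(ClockedBase):
    """Overrides serialize() only, so the inherited unserialize() fails."""

    def serialize(self, cp):
        pass  # writes nothing, and does not call the base class


class DiskEmptyUnserialize(DiskWriteOnlyOverride):
    """Workaround from this message: also override unserialize(), left empty."""

    def unserialize(self, cp):
        pass


class DiskCallsBase(ClockedBase):
    """Alternative (untested in gem5): chain to the base class when writing."""

    def serialize(self, cp):
        super().serialize(cp)
        # device-specific state would be written here


def round_trip(cls):
    """Serialize one instance into a dict, then restore a fresh instance."""
    cp = {}
    cls().serialize(cp)
    try:
        cls().unserialize(cp)
        return "ok"
    except KeyError as err:
        return f"fatal: {err.args[0]}"


if __name__ == "__main__":
    for cls in (DiskWriteOnlyOverride, DiskEmptyUnserialize, DiskCallsBase):
        print(cls.__name__, "->", round_trip(cls))
```

In this toy, the empty override restores nothing, so the base-class fields
simply start from their defaults after a restore, while chaining to the base
class keeps them in the checkpoint. Which behavior is appropriate for MmDisk
is the part I am unsure about.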

Additionally, even without checkpointing, there appears to be a bug in FS
mode where the simulation exits because the tick count suddenly reaches its
limit of 2^64. It is hard for me to reproduce this bug because it is not
deterministic, and I have not yet encountered it when running gem5.debug
under gdb.

Finally, I have tested a variety of applications on the OpenSPARC T1
Solaris 10 disk image, and almost all of them work out of the box except
for Water-spatial from SPLASH-2, which gives me an arithmetic exception
error. I even modified the m5 binary to allow for checkpointing calls
within the simulator itself. Overall, I am thankful for the simulator
infrastructure despite its current state. I am considering documenting all
of the steps needed to use SPARC's FS simulator in case someone else wishes
to use it in the future.

-- 
Thanks,
Khalique Ahmed
_______________________________________________
gem5-dev mailing list
gem5-dev@gem5.org
http://m5sim.org/mailman/listinfo/gem5-dev
