Hi Haiyang,
Checkpointing with Ruby hasn't been tested in a while, as far as I know. By
that I mean I haven't used the feature in a long time, and I don't know of
anyone tracking mainline gem5 who uses Ruby together with checkpointing :).
When taking a checkpoint, you have to use a protocol that implements the
"FLUSH" command. I believe that only MOESI_hammer implements this command.
I could be wrong, though.
Can you checkpoint with MOESI_hammer and restore with MOESI_hammer?
If you're getting deadlocks during restore, it's probably a real deadlock.
FIFO ordering violations also seem like something you can't safely ignore.
Are you using the same topology for checkpoint and restore? Ruby uses an
order-specific list to save and restore the controllers. So, if you restore
with a different number of directories, this could shift the controller
list so that the L1 caches end up at different positions in it. That might
break restore. This mechanism is not very robust.
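To illustrate the ordering concern, here is a toy sketch in plain Python. It is
not gem5 code: the function names, the "directories first" ordering, and the
state strings are all made up for illustration. It only shows how state saved
by list position can land on the wrong controller once the directory count
changes.

```python
# Toy model only: every name here is hypothetical and none of it is a gem5 API.

def build_controllers(num_l1s, num_dirs):
    """Return controller names in a fixed order (assumed: directories first)."""
    dirs = [f"dir{i}" for i in range(num_dirs)]
    l1s = [f"l1_{i}" for i in range(num_l1s)]
    return dirs + l1s


def save_by_position(controllers):
    """Record each controller's state keyed only by its index in the list."""
    return {index: f"state-of-{name}" for index, name in enumerate(controllers)}


def restore_by_position(controllers, checkpoint):
    """Give each controller whatever state was saved at the same index."""
    return {name: checkpoint.get(index) for index, name in enumerate(controllers)}


# Checkpoint taken with 1 directory and 2 L1 caches.
checkpoint = save_by_position(build_controllers(num_l1s=2, num_dirs=1))

# Restore with the same topology, then with one extra directory.
same = restore_by_position(build_controllers(num_l1s=2, num_dirs=1), checkpoint)
different = restore_by_position(build_controllers(num_l1s=2, num_dirs=2), checkpoint)

print(same["l1_0"])       # state-of-l1_0
print(different["l1_0"])  # state-of-l1_1 (shifted by the extra directory)
```

In this toy model the extra directory pushes every L1 down by one slot, so
l1_0 receives l1_1's state and the last L1 receives nothing.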
I would suggest using debug flags to track down the problem (RubyCacheTrace
is the flag that prints information while restoring; ProtocolTrace may also
help). You've likely found a bug. If you can track it down and post a fix to
Gerrit, we'd appreciate it!
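As a rough sketch of how those flags could be passed, the snippet below only
builds and prints a gem5 command line. The binary path, the config script, and
the script arguments (including the checkpoint number) are assumptions about
your setup, so substitute whatever you actually used to take the checkpoint.

```python
import shlex


def restore_command(gem5_binary, config_script, debug_flags, script_args):
    """Build an argv list that runs gem5 with the given debug flags enabled."""
    flag_option = "--debug-flags=" + ",".join(debug_flags)
    return [gem5_binary, flag_option, config_script, *script_args]


cmd = restore_command(
    "build/X86/gem5.opt",                   # assumed build location
    "configs/example/fs.py",                # assumed config script
    ["RubyCacheTrace", "ProtocolTrace"],    # flags named in this email
    ["--ruby", "--checkpoint-restore=1"],   # assumed script arguments
)

# Print a copy-pasteable shell command; nothing is executed here.
print(" ".join(shlex.quote(part) for part in cmd))
```

ProtocolTrace output can get very large, so you may want to redirect it to a
file.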
Let me know if you run into any other questions.
Jason
On Fri, Mar 16, 2018 at 1:45 PM Haiyang Han <haiyang@eecs.northwestern.edu> wrote:
> Hi all,
>
> I'm trying to create and restore checkpoints with ruby while simulating a
> 16-core O3CPU, full system, x86 configuration. I can create the checkpoints
> with no problem, but a little while after restoring from the checkpoints, I
> am seeing all sorts of gem5 aborts due to panics. Sometimes it complains
> about a possible deadlock, other times it complains that FIFO ordering is
> violated. Below are the Ruby protocols I tried:
>
> Protocol used to write checkpoint    Protocol used to restore
> MOESI_hammer                         MESI_Two_level
> MOESI_hammer                         MESI_Three_level
> MESI_Two_level                       MESI_Two_level
> MESI_Three_level                     MESI_Three_level
>
> I read on http://gem5.org/Checkpoints that only MOESI_hammer supports the
> writing of checkpoints. Does this still apply to the newest gem5 versions?
> Could it be that the traffic generated by the 16 cores is too much for the
> ruby system to handle correctly? Is it possible to solve the deadlock issue
> by manually increasing a threshold somewhere in the source code? What about
> the FIFO ordering violation? It'd be great if any of these are answered :D
>
> Thanks!
> Haiyang
> ___
> gem5-users mailing list
> gem5-users@gem5.org
> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users