Hi Mohammad,

I solved the problem. I modified the file boot.easy.ckpt.sh as shown in Table 1(a), so that each node pings the other node before the checkpoint is taken, and the checkpoint was then created successfully. I am wondering whether there is a better way to create a checkpoint in a heterogeneous cluster with dist-gem5.
Table 1. a) modified part; b) original part (both start at line 48 of the script)

a) Modified part:

    if [ "$MY_RANK" == "0" ]
    then
        sleep 2
        ping -c 1 192.168.0.3
        sleep 1
    else
        ping -c 1 192.168.0.2
        /sbin/m5 checkpoint 1
    fi

b) Original part:

    if [ "$MY_RANK" == "0" ]
    then
        /sbin/m5 checkpoint 1
    else
        sleep 0.01
    fi

Best Regards,
Boyang Xu

A graduate student in UVIC

On Thu, Mar 29, 2018 at 7:14 PM, Mohammad Alian <[email protected]> wrote:

> Can you post the rcS script that you use for taking checkpoint? Can you
> ping the other node before taking checkpoint?
>
> On Thu, Mar 29, 2018 at 6:41 PM, Boyang Xu <[email protected]> wrote:
>
>> Hi, Mohammad
>> The exact problem is that apache bench fails to run in the above
>> configuration. The attachments are the output files and input files.
>> BTW, is it possible to create a checkpoint with an Android disk image by
>> dist-gem5? Is there any special requirement for the Android disk image's version?
>> Looking forward to your reply.
>>
>> Best Regards,
>> Boyang Xu
>>
>> A graduate student in UVIC
>>
>> On Thu, Mar 29, 2018 at 2:10 PM, Mohammad Alian <[email protected]>
>> wrote:
>>
>>> I see that both nodes write a checkpoint. What is the problem exactly?
>>>
>>> Best,
>>> Mohammad
>>>
>>> On Wed, Mar 28, 2018 at 9:22 PM, Boyang Xu <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Although I followed the tutorial “iiswc17-tutorial-final-dist-gem5” to
>>>> model a heterogeneous cluster, I failed because a checkpoint could not be
>>>> created. There are two nodes in the heterogeneous cluster. Node 0
>>>> has two CPUs while node 1 has one CPU. I think the reason is that the
>>>> whole dist-gem5 process ends after node 0 finishes creating a
>>>> checkpoint, while node 1 has not yet finished its initialization and
>>>> checkpoint creation, because the node 0 gem5 process with two
>>>> CPUs executes faster than the node 1 gem5 process with one CPU. Node 1 does not
>>>> actually finish the checkpoint.
>>>>
>>>> To solve it, I added the command “sleep 5” before or after the
>>>> command “/sbin/m5 checkpoint 1” in the file boot.easy.ckpt.rcS, but it still failed.
>>>> The attachments are my scripts and output files.
>>>>
>>>> Looking forward to your reply.
>>>> Best Regards,
>>>> Boyang Xu
>>>>
>>>> A graduate student in UVIC
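
On the "better way" question, one possible direction is to replace the fixed sleeps with a retry loop: each node polls its peer until the ping succeeds and only then continues. The following is a minimal sketch under stated assumptions, not a tested dist-gem5 recipe. The helper names wait_for_peer and peer_ip_for_rank and the variables MAX_TRIES and RETRY_DELAY are hypothetical; the IP addresses, the MY_RANK test, and the /sbin/m5 call are taken from Table 1(a). The guard on /sbin/m5 is also an assumption, added so the script does nothing outside the simulated guest.

```shell
#!/bin/bash
# Sketch only: poll the peer node instead of relying on fixed sleeps.
# wait_for_peer, peer_ip_for_rank, MAX_TRIES and RETRY_DELAY are hypothetical
# names for this example; they are not part of gem5.

# wait_for_peer TRIES CMD [ARGS...]
# Runs CMD up to TRIES times and returns 0 on the first success, 1 otherwise.
wait_for_peer() {
    tries=$1
    shift
    while [ "$tries" -gt 0 ]; do
        if "$@" > /dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        # Pause between attempts, but not after the last one.
        if [ "$tries" -gt 0 ]; then
            sleep "${RETRY_DELAY:-1}"
        fi
    done
    return 1
}

# Peer addresses as used in Table 1(a).
peer_ip_for_rank() {
    if [ "$1" = "0" ]; then
        echo "192.168.0.3"
    else
        echo "192.168.0.2"
    fi
}

# Assumption: only act where MY_RANK is set and the m5 utility is installed.
if [ -n "${MY_RANK:-}" ] && [ -x /sbin/m5 ]; then
    PEER_IP=$(peer_ip_for_rank "$MY_RANK")
    if wait_for_peer "${MAX_TRIES:-30}" ping -c 1 "$PEER_IP"; then
        if [ "$MY_RANK" = "0" ]; then
            # As in Table 1(a), rank 0 only waits briefly after the ping.
            sleep 1
        else
            # As in Table 1(a), the other rank triggers the checkpoint.
            /sbin/m5 checkpoint 1
        fi
    else
        echo "peer $PEER_IP not reachable, no checkpoint taken" >&2
    fi
fi
```

Because the loop returns as soon as one ping succeeds, the wait should adapt to however long the slower node needs to come up, rather than assuming a fixed delay. Whether this is sufficient for every dist-gem5 configuration has not been verified here.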
_______________________________________________ gem5-users mailing list [email protected] http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
