Boyang,

I don't see any problem with sleeping for a few seconds before dumping a
checkpoint. If that doesn't work, sleep longer before dumping the checkpoint.
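
If guessing the sleep length gets tedious, one option is to poll the other
node instead of sleeping for a fixed time. The following is only a rough
sketch, not something dist-gem5 provides: wait_for_peer, PING_CMD and
POLL_INTERVAL are hypothetical names made up for illustration, and the IP
address is the one from your script.

```shell
#!/bin/sh
# Hypothetical helper (not part of gem5 or dist-gem5): keep pinging the
# peer node until it answers, instead of sleeping for a fixed time.

# Assumed defaults; both can be overridden from the environment.
PING_CMD="${PING_CMD:-ping -c 1}"
POLL_INTERVAL="${POLL_INTERVAL:-1}"

# wait_for_peer IP [MAX_TRIES]
# Returns 0 as soon as one ping to IP succeeds, 1 after MAX_TRIES failures.
wait_for_peer() {
    peer_ip="$1"
    max_tries="${2:-30}"
    tries=0
    while [ "$tries" -lt "$max_tries" ]; do
        if $PING_CMD "$peer_ip" > /dev/null 2>&1; then
            return 0
        fi
        tries=$((tries + 1))
        sleep "$POLL_INTERVAL"
    done
    return 1
}

# Possible use in the boot script (left commented out in this sketch):
#   if [ "$MY_RANK" = "0" ]; then
#       wait_for_peer 192.168.0.3 && /sbin/m5 checkpoint 1
#   fi
```

Whether a successful ping is a sufficient sign that the simulated system is
fully initialized is an assumption you would need to check for your setup.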

Best,
Mohammad

On Tue, Apr 3, 2018 at 12:25 PM, Boyang Xu <[email protected]> wrote:

> Hi Mohammad,
>
> Thanks for your reply. I am wondering whether there is a function that
> detects the initialization status automatically. If such a function ran in
> the back end of the gem5 process, I could create a checkpoint in a
> heterogeneous cluster using the original boot script.
>
> Best Regards,
> Boyang Xu
>
> A graduate student in UVIC
>
> 2018-04-02 19:14 GMT-07:00 Mohammad Alian <[email protected]>:
>
>> Hi Boyang,
>>
>> Thanks for the update. I guess you need to wait for a few seconds for the
>> networking service to start in the simulated system (you cannot use the
>> networking service immediately after the boot up).
>>
>> Then it should also work if you modify the original boot script as shown
>> below:
>>
>> if [ "$MY_RANK" == "0" ]
>> then
>>       sleep 2
>>       /sbin/m5 checkpoint 1
>> else
>>       sleep 2
>> fi
>>
>>
>> Best,
>> Mohammad
>>
>>
>> On Mon, Apr 2, 2018 at 7:42 PM, Boyang Xu <[email protected]> wrote:
>>
>>> Hi, Mohammad
>>>
>>> I solved the problem. I modified the file boot.easy.ckpt.sh as shown in
>>> Table 1 a), so that each node pings the other node before the checkpoint
>>> is taken, and the checkpoint was created successfully.
>>> I am wondering if there is a better way to create a checkpoint in
>>> a heterogeneous cluster with dist-gem5.
>>>
>>> Table 1. a) modified part; b) original part
>>>
>>> a)
>>> if [ "$MY_RANK" == "0" ]
>>> then
>>>     sleep 2
>>>     ping -c 1 192.168.0.3
>>>     sleep 1
>>> else
>>>     ping -c 1 192.168.0.2
>>>     /sbin/m5 checkpoint 1
>>> fi
>>>
>>> b)
>>> if [ "$MY_RANK" == "0" ]
>>> then
>>>     /sbin/m5 checkpoint 1
>>> else
>>>     sleep 0.01
>>> fi
>>>
>>> Best Regards,
>>> Boyang Xu
>>>
>>> A graduate student in UVIC
>>>
>>> On Thu, Mar 29, 2018 at 7:14 PM, Mohammad Alian <[email protected]>
>>> wrote:
>>>
>>>> Can you post the rcS script that you use for taking the checkpoint? Can
>>>> you ping the other node before taking the checkpoint?
>>>>
>>>> On Thu, Mar 29, 2018 at 6:41 PM, Boyang Xu <[email protected]> wrote:
>>>>
>>>>> Hi, Mohammad
>>>>> The exact problem is that Apache Bench fails to run in the above
>>>>> configuration. The output and input files are attached.
>>>>> BTW, is it possible to create a checkpoint with an Android disk image
>>>>> using dist-gem5? Is there a special requirement on the Android disk
>>>>> image's version?
>>>>> Looking forward to your reply.
>>>>>
>>>>> Best Regards,
>>>>> Boyang Xu
>>>>>
>>>>> A graduate student in UVIC
>>>>>
>>>>> On Thu, Mar 29, 2018 at 2:10 PM, Mohammad Alian <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> I see that both nodes write a checkpoint. What is the problem
>>>>>> exactly?
>>>>>>
>>>>>> Best,
>>>>>> Mohammad
>>>>>>
>>>>>>
>>>>>> On Wed, Mar 28, 2018 at 9:22 PM, Boyang Xu <[email protected]> wrote:
>>>>>>
>>>>>>> Hi all,
>>>>>>>
>>>>>>> Although I followed the tutorial “iiswc17-tutorial-final-dist-gem5”
>>>>>>> to model a heterogeneous cluster, I could not create a checkpoint.
>>>>>>> There are two nodes in the heterogeneous cluster: node 0 has two CPUs
>>>>>>> and node 1 has one CPU. I think the reason is that the whole dist-gem5
>>>>>>> process ends as soon as node 0 finishes creating its checkpoint, while
>>>>>>> node 1 has not yet finished initialization and checkpoint creation,
>>>>>>> because the node 0 gem5 process with two CPUs executes faster than the
>>>>>>> node 1 gem5 process with one CPU. As a result, node 1 never actually
>>>>>>> completes its checkpoint.
>>>>>>>
>>>>>>> To solve this, I added the command “sleep 5” before or after the
>>>>>>> command “/sbin/m5 checkpoint 1” in the file boot.easy.ckpt.rcS, but it
>>>>>>> still failed. My scripts and output files are attached.
>>>>>>>
>>>>>>> Looking forward to your reply.
>>>>>>> Best Regards,
>>>>>>> Boyang Xu
>>>>>>>
>>>>>>> A graduate student in UVIC
>>>>>>>
>>>>>>> _______________________________________________
>>>>>>> gem5-users mailing list
>>>>>>> [email protected]
>>>>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>>>>