Hi, Mohammad

The attached file boot.easy.ckpt.rcS is the rcS script that I use for
taking the checkpoint. It is the same as the one on the official dist-gem5
website.

Before taking the checkpoint, I tried to ping node 1 from node 0, but the
ping failed.
The ping script is the attached file boot.easy.ping.rcS. The output files
are the attached log.0 and log.1.
The full content of *m5out.0/testsys.terminal* for node 0, which sends the
ping, is as follows:

Loading new script...
start apache bench
Hello from 0 of 2
PING 192.168.0.3 (192.168.0.3) 56(84) bytes of data.
AH00557: apache2: apr_sockaddr_info_get() failed for node0
AH00558: apache2: Could not reliably determine the server's fully qualified
domain name, using 127.0.0.1. Set the 'ServerName' directive globally to
suppress this message


The full content of *m5out.1/testsys.terminal* for node 1, which receives
the ping, is as follows:

start apache bench
Hello from 1 of 2
connect: Network is unreachable
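
To make the two terminal files easier to compare, here is a minimal sketch of a shell helper that classifies a testsys.terminal file by the ping outcome it records. The function name classify_terminal and its labels are hypothetical, not part of gem5 or dist-gem5. It only relies on messages that ping itself prints: "Network is unreachable" typically means the kernel has no route to the destination (for example, the interface was never configured), while a "PING" header with no "bytes from" line means the request went out but no reply was logged.

```shell
#!/bin/sh
# Hypothetical helper (not part of gem5): classify a terminal log by the
# ping outcome it records. Prints exactly one label on stdout.
classify_terminal() {
    log=$1
    if [ ! -r "$log" ]; then
        echo "missing"        # file absent or unreadable
    elif grep -q "Network is unreachable" "$log"; then
        echo "unreachable"    # no route to the destination on this node
    elif grep -q "bytes from" "$log"; then
        echo "reply"          # at least one echo reply was logged
    elif grep -q "^PING " "$log"; then
        echo "no-reply"       # ping started but no reply was logged
    else
        echo "no-ping"        # ping never appears in the log
    fi
}
```

For the files quoted above, `classify_terminal m5out.1/testsys.terminal` would print "unreachable" and `classify_terminal m5out.0/testsys.terminal` would print "no-reply", which suggests looking at the network setup on node 1 first.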


Could the failure be caused by using the wrong KERNEL or DTB?
My ckpt.sh file is as follows:

export M5_PATH=~/dist-gem5/gem5
GEM5_DIR=~/dist-gem5/gem5
RUNDIR=$(pwd)/rundir.ping

#IMG=$M5_PATH/disks/aarch64-ubuntu-trusty-headless.img
IMG=$M5_PATH/disks/memcached.apache.mysql.aarch64.img
KERNEL=$M5_PATH/binaries/vmlinux.aarch64.20140821
DTB=$M5_PATH/binaries/vexpress.aarch64.20140821.dtb

FS_CONFIG=$GEM5_DIR/configs/example/fs.py
SW_CONFIG=$GEM5_DIR/configs/dist/sw.py
GEM5_EXE=$GEM5_DIR/build/ARM/gem5.opt

BOOT_SCRIPT=$GEM5_DIR/util/dist/apache.heterogineous/boot.easy.ping.rcS
GEM5_DIST_SH=$GEM5_DIR/util/dist/gem5-dist.heterogineous.sh
DEBUG_FLAGS="--debug-flags=DistEthernet"

#CHKPT_RESTORE="-r1"
NNODES=2

$GEM5_DIST_SH -n $NNODES \
    -x $GEM5_EXE \
    -r $RUNDIR \
    -s $SW_CONFIG \
    -f $FS_CONFIG \
    --m5-args \
        $DEBUG_FLAGS \
    --fs-args \
        --cpu-type=AtomicSimpleCPU \
        --num-cpus=1 \
        --machine-type=VExpress_EMM64 \
        --disk-image=$IMG \
        --kernel=$KERNEL \
        --dtb-filename=$DTB \
        --script=$BOOT_SCRIPT \
    --node0-args \
        --num-cpus=2
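
As a first step toward ruling out a wrong KERNEL or DTB path, here is a minimal sketch of a pre-flight check that could be placed in ckpt.sh before the launch command. The function name check_inputs is hypothetical. It only confirms that each file exists and is readable; it cannot tell whether the kernel and DTB match each other or the machine type.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of gem5): report every argument
# that is not a readable file. Returns 0 if all are readable, 1 otherwise.
check_inputs() {
    missing=0
    for f in "$@"; do
        if [ ! -r "$f" ]; then
            echo "missing or unreadable: $f" >&2
            missing=1
        fi
    done
    return $missing
}
```

In ckpt.sh this could be called as `check_inputs "$IMG" "$KERNEL" "$DTB" "$BOOT_SCRIPT" || exit 1` just before the `$GEM5_DIST_SH` line, so that a mistyped path stops the run before any gem5 process starts.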


Looking forward to your reply.


Best Regards,
Boyang Xu

A graduate student in UVIC

On Thu, Mar 29, 2018 at 7:14 PM, Mohammad Alian <[email protected]>
wrote:

> Can you post the rcS script that you use for taking checkpoint? Can you
> ping the other node before taking checkpoint?
>
> On Thu, Mar 29, 2018 at 6:41 PM, Boyang Xu <[email protected]> wrote:
>
>> Hi, Mohammad
>> The exact problem is that apache bench fails to run in the above
>> configuration. The attachments are the output files and input files.
>> BTW, is it possible to create a checkpoint with an Android disk image using
>> dist-gem5? Is there a special requirement on the Android disk image's version?
>> Looking forward to your reply.
>>
>> Best Regards,
>> Boyang Xu
>>
>> A graduate student in UVIC
>>
>> On Thu, Mar 29, 2018 at 2:10 PM, Mohammad Alian <[email protected]>
>> wrote:
>>
>>> I see that both nodes write a checkpoint. What is the problem exactly?
>>>
>>> Best,
>>> Mohammad
>>>
>>>
>>> On Wed, Mar 28, 2018 at 9:22 PM, Boyang Xu <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> Although I followed the tutorial “iiswc17-tutorial-final-dist-gem5” to
>>>> model a heterogeneous cluster, I failed to create a checkpoint. There
>>>> are two nodes in the heterogeneous cluster: node 0 has two CPUs, while
>>>> node 1 has one CPU. I think the reason is that the whole dist-gem5
>>>> process ends as soon as node 0 finishes creating its checkpoint, while
>>>> node 1 has not yet finished initializing and creating its checkpoint,
>>>> because the node 0 gem5 process with two CPUs runs faster than the
>>>> node 1 gem5 process with one CPU. Node 1 never actually finishes its
>>>> checkpoint.
>>>>
>>>> To solve this, I added the command “sleep 5” before or after the
>>>> command “/sbin/m5 checkpoint 1” in the file boot.easy.ckpt.rcS, but it
>>>> still failed. The attachments are my scripts and output files.
>>>>
>>>> Looking forward to your reply.
>>>> Best Regards,
>>>> Boyang Xu
>>>>
>>>> A graduate student in UVIC
>>>>
>>>> _______________________________________________
>>>> gem5-users mailing list
>>>> [email protected]
>>>> http://m5sim.org/cgi-bin/mailman/listinfo/gem5-users
>>>>

Attachment: boot.easy.ckpt.rcS
Description: Binary data

Attachment: boot.easy.ping.rcS
Description: Binary data

Attachment: log.0
Description: Binary data

Attachment: log.1
Description: Binary data
