OK,
Thank you for your help.

I will do it.

But is it still possible to restart a previously checkpointed MPI program
on a different set of nodes?

Also, I don't know how to examine the core file from the segfault in order
to send you the backtrace. Could you guide me?

Thank you.

I look forward to your reply.

Best regards.

2015-04-15 4:19 GMT+01:00 Ralph Castain <r...@open-mpi.org>:

> I’m afraid I don’t know much about that code, but I can try to help. Can
> you look at the core file from the segfault and send the backtrace?
>
>
> On Apr 14, 2015, at 8:04 AM, Cyrille DIBAMOU MBEUYO <cdiba...@gmail.com>
> wrote:
>
> Thank you for your reply.
>
> This error appears on the master node, where I launch the "ompi-restart"
> command.
>
> It is on the 1.6.5 series.
>
> 2015-04-14 15:21 GMT+01:00 Ralph Castain <r...@open-mpi.org>:
>
>> Is this on the master? Or on the 1.8 series?
>>
>>
>> On Apr 14, 2015, at 2:12 AM, Cyrille DIBAMOU MBEUYO <cdiba...@gmail.com>
>> wrote:
>>
>> *Good morning,*
>>
>> In my experiments, I have shared a folder across my cluster of 4 computers.
>> I then run an MPI program (which calculates the number pi) on 2 computers,
>> and I checkpoint it with the ompi-checkpoint command.
>> For the restart, I modify the nodes file to change one computer name, so
>> that the two nodes used for the restart differ from the two used for the
>> original run.
>> When I restart the checkpointed application with the ompi-restart command
>> (and the nodes file), I get this error message: *mpirun noticed
>> that process rank 1 with PID 1619 on node oartwo exited on signal 11
>> (segmentation fault).*
>>
>> I would like to know what the problem is and how to solve it.
>>
>>
>>
>> *Thank you*
>>
>>
>> *I look forward to your reply.*
>> *Best regards.*
>>
>> --
>> DIBAMOU MBEUYO Cyrille
>> Computer Engineer, M.Sc.
>> Ph.D. Student in Computer Science
>> *Mobile* : (+237) 696 608 826 / 674 979 502
>> The University Of Ngaoundere,  CAMEROUN
>> *Other Email *: cdiba...@univ-ndere.cm
>>
>>  _______________________________________________
>> devel mailing list
>> de...@open-mpi.org
>> Subscription: http://www.open-mpi.org/mailman/listinfo.cgi/devel
>> Link to this post:
>> http://www.open-mpi.org/community/lists/devel/2015/04/17217.php
>>
>



-- 
DIBAMOU MBEUYO Cyrille
Computer Engineer, M.Sc.
Ph.D. Student in Computer Science
*Mobile* : (+237) 696 608 826 / 674 979 502
The University Of Ngaoundere,  CAMEROUN
*Other Email *: cdiba...@univ-ndere.cm
