Re: [OMPI users] Fault tolerant feature in Open MPI

2016-04-04 Thread Xavier Besseron
Hi Husen, Sorry for the late reply. I gave FTB a quick try and managed to get it to work on my local machine. I just had to apply this patch to prevent the agent from crashing. Maybe this was your issue: https://github.com/besserox/ftb/commit/01aa44f5ed34e35429ddf99084395e4e8ba67b7c Here is a (v

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-20 Thread Husen R
Dear Xavier, Yes, I did. I followed the instructions in that file, especially sub-section 4.1. I configured the boot-strap IP using the ./configure options. On the front-end node, the boot-strap IP is its own IP address because I want it to act as the ftb_database_server. On every compute node,

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-19 Thread Xavier Besseron
Dear Husen, Did you check the information in file ./docs/chapters/01_FTB_on_Linux.txt inside the ftb tarball? You might want to look at sub-section 4.1. You can also try to get support on this via the MVAPICH2 mailing list. Best regards, Xavier On Fri, Mar 18, 2016 at 11:24 AM, Husen R wrot

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-18 Thread Ralph Castain
I don’t believe OMPI supports FTB, I’m afraid - you might want to post your question on an FTB mailing list (I don’t recall if that project is even active any more?) > On Mar 18, 2016, at 3:24 AM, Husen R wrote: > > Dear all, > > Thanks for the reply and valuable information. > > I have c

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-18 Thread Husen R
Dear all, Thanks for the reply and valuable information. I have configured MVAPICH2 using the instructions available in a resource provided by Xavier. I have also installed FTB (Fault-Tolerant Backplane) so that MVAPICH2 has the process migration feature. However, I got the following error

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-17 Thread Xavier Besseron
On Thu, Mar 17, 2016 at 3:17 PM, Ralph Castain wrote: > Just to clarify: I am not aware of any MPI that will allow you to relocate a > process while it is running. You have to checkpoint the job, terminate it, > and then restart the entire thing with the desired process on the new node. > Dear a

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-17 Thread Bland, Wesley
Presumably Adaptive MPI would allow you to do that. I don’t know all the details of how that works there though. From: users on behalf of Ralph Castain Reply-To: Open MPI Users Date: Thursday, March 17, 2016 at 9:17 AM To: Open MPI Users Subject: Re: [OMPI users] Fault tolerant feature in

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-17 Thread Ralph Castain
Just to clarify: I am not aware of any MPI that will allow you to relocate a process while it is running. You have to checkpoint the job, terminate it, and then restart the entire thing with the desired process on the new node. > On Mar 16, 2016, at 3:15 AM, Husen R wrote: > > In the case of

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-17 Thread Dave Love
Husen R writes: > Dear Open MPI Users, > > > Does the current stable release of Open MPI (v1.10 series) support fault > tolerant feature ? > I got the information from Open MPI FAQ that The checkpoint/restart support > was last released as part of the v1.6 series. > I just want to make sure about

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-16 Thread Husen R
In the case of an MPI application (not Gromacs), how do I relocate it from one node to another while it is running? I'm sorry, but as far as I know, the *ompi-restart* command is used to restart an application from a checkpoint file once the application has already terminated (no longer run

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-16 Thread Jeff Hammond
Just checkpoint-restart the app to relocate. The overhead will be lower than trying to do it with MPI. Jeff On Wednesday, March 16, 2016, Husen R wrote: > Hi Jeff, > > Thanks for the reply. > > After consulting the Gromacs docs, as you suggested, Gromacs already > supports checkpoint/restart. than

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-16 Thread Husen R
Hi Jeff, Thanks for the reply. After consulting the Gromacs docs, as you suggested, Gromacs already supports checkpoint/restart. Thanks for the suggestion. Previously, I asked about checkpoint/restart in Open MPI because I want to checkpoint an MPI application and restart/migrate it while it is run

Re: [OMPI users] Fault tolerant feature in Open MPI

2016-03-16 Thread Jeff Hammond
Why do you need OpenMPI to do this? Molecular dynamics trajectories are trivial to checkpoint and restart at the application level. I'm sure Gromacs already supports this. Please consult the Gromacs docs or user support for details. Jeff On Tuesday, March 15, 2016, Husen R wrote: > Dear Open MP
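
To illustrate the application-level checkpoint/restart idea mentioned in the message above, here is a minimal Python sketch. It is not Gromacs code and does not use any Open MPI feature; the function names (save_checkpoint, load_checkpoint, run), the state layout, and the toy oscillator are all hypothetical, chosen only to show the pattern: write the state periodically, and on startup resume from the last saved state if one exists.

```python
import os
import pickle
import tempfile


def save_checkpoint(path, state):
    """Write the state to a temp file, then rename it over the target.

    The rename keeps a crash during the write from leaving a
    half-written checkpoint at `path`.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "wb") as f:
        pickle.dump(state, f)
    os.replace(tmp_path, path)


def load_checkpoint(path):
    """Return the saved state, or None if no checkpoint exists yet."""
    if not os.path.exists(path):
        return None
    with open(path, "rb") as f:
        return pickle.load(f)


def run(path, total_steps, checkpoint_every=10, stop_after=None):
    """Advance a toy trajectory, resuming from `path` when possible.

    `stop_after` (hypothetical, for demonstration) simulates the job
    being terminated early at a given step.
    """
    state = load_checkpoint(path) or {"step": 0, "x": 0.0, "v": 1.0}
    dt = 0.01
    while state["step"] < total_steps:
        # Toy dynamics: one explicit update of a harmonic oscillator.
        state["v"] -= state["x"] * dt
        state["x"] += state["v"] * dt
        state["step"] += 1
        if state["step"] % checkpoint_every == 0:
            save_checkpoint(path, state)
        if stop_after is not None and state["step"] >= stop_after:
            return state  # simulated termination
    save_checkpoint(path, state)
    return state
```

Calling run() a second time with the same checkpoint path, possibly on a different node that can read the file, continues from the last saved step rather than from step 0.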

[OMPI users] Fault tolerant feature in Open MPI

2016-03-16 Thread Husen R
Dear Open MPI Users, Does the current stable release of Open MPI (v1.10 series) support fault tolerance? I read in the Open MPI FAQ that checkpoint/restart support was last released as part of the v1.6 series. I just want to confirm this. By the way, does Ope