Re: [OMPI users] Related to project ideas in OpenMPI

2011-08-27 Thread Joshua Hursey
>>>>> 
>>>>> There's also Kernel-Level Checkpointing vs. User-Level Checkpointing -
>>>>> if you can checkpoint an MPI task and restart it on a new node, then
>>>>> this is also "process migration".
>>>>> 
>>>>> Of course, doing a checkpoint & restart can be slower than pure
>>>>> in-kernel process migration, but the advantage is that you don't need
>>>>> any kernel support, and can in fact do all of it in user-space.
>>>>> 
>>>>> Rayson
>>>>> 
>>>>> 
>>>>> On Thu, Aug 25, 2011 at 10:26 AM, Ralph Castain <r...@open-mpi.org> wrote:
>>>>>> It also depends on what part of migration interests you - are you 
>>>>>> wanting to look at the MPI part of the problem (reconnecting MPI 
>>>>>> transports, ensuring messages are not lost, etc.) or the RTE part of the 
>>>>>> problem (where to restart processes, detecting failures, etc.)?
>>>>>> 
>>>>>> 
>>>>>> On Aug 24, 2011, at 7:04 AM, Jeff Squyres wrote:
>>>>>> 
>>>>>>> Be aware that process migration is a pretty complex issue.
>>>>>>> 
>>>>>>> Josh is probably the best one to answer your question directly, but 
>>>>>>> he's out today.
>>>>>>> 
>>>>>>> 
>>>>>>> On Aug 24, 2011, at 5:45 AM, srinivas kundaram wrote:
>>>>>>> 
>>>>>>>> I am a final-year grad student looking for my final year project in 
>>>>>>>> OpenMPI. We are a group of 4 students.
>>>>>>>> I wanted to know about the "Process Migration" of MPI 
>>>>>>>> processes in OpenMPI.
>>>>>>>> Can anyone suggest ideas for a project related to process 
>>>>>>>> migration in OpenMPI, or other topics in Systems?
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> regards,
>>>>>>>> Srinivas Kundaram
>>>>>>>> srinu1...@gmail.com
>>>>>>>> +91-8149399160
>>>>>>>> ___
>>>>>>>> users mailing list
>>>>>>>> us...@open-mpi.org
>>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>>> 
>>>>>>> 
>>>>>>> --
>>>>>>> Jeff Squyres
>>>>>>> jsquy...@cisco.com
>>>>>>> For corporate legal information go to:
>>>>>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>>>>>> 
>>>>>>> 
>>>>>>> ___
>>>>>>> users mailing list
>>>>>>> us...@open-mpi.org
>>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>> 
>>>>>> 
>>>>>> ___
>>>>>> users mailing list
>>>>>> us...@open-mpi.org
>>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Rayson
>>>>> 
>>>>> ==
>>>>> Open Grid Scheduler - The Official Open Source Grid Engine
>>>>> http://gridscheduler.sourceforge.net/
>>>>> 
>>>>> ___
>>>>> users mailing list
>>>>> us...@open-mpi.org
>>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>>> 
>>>> 
>>>> ___
>>>> users mailing list
>>>> us...@open-mpi.org
>>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>> 
>>> 
>>> 
>>> 
>>> --
>>> Rayson
>>> 
>>> ==
>>> Open Grid Scheduler - The Official Open Source Grid Engine
>>> http://gridscheduler.sourceforge.net/
>>> 
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> 
>>> 
>> 
>> 
>> 
>> -- 
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 




Re: [OMPI users] BLCR support not building on 1.5.3

2011-05-27 Thread Joshua Hursey
I'm glad that worked.

I understand the confusion. The configure output could be better. It shouldn't 
be too difficult to clean up. I filed a ticket so we don't forget about this 
issue. The ticket is linked below if you are interested:
  https://svn.open-mpi.org/trac/ompi/ticket/2807

Next time I cycle back to the C/R functionality I'll try to address it, but if 
someone else beats me to it then that should be reflected in the ticket.

-- Josh

On May 27, 2011, at 3:54 PM, Bill Johnstone wrote:

> Hello,
> 
> 
> Thank you very much for this.  I've replied further below:
> 
> 
> - Original Message -
>> From: Joshua Hursey <jjhur...@open-mpi.org>
> [...]
>> What other configure options are you passing to Open MPI? Specifically the 
>> configure test will always fail if '--with-ft=cr' is not specified - by 
>> default Open MPI will only build the BLCR component if C/R FT is requested 
>> by 
>> the user.
> 
> This was it!  Now the BLCR support builds just fine.
> 
> If I may offer some feedback:
> 
> When I think "Checkpoint/Restart", I don't immediately think "Fault 
> Tolerance"; rather, I'm interested in it as a better alternative to 
> suspend/resume.  So, just from reading the configure help, configure output, 
> docs, etc., I had *no* idea that turning on the "ft" configure option was a 
> prerequisite for BLCR support to compile.
> 
> I'd like to request that this be made easier to spot.  At a minimum, the 
> configure --help output could mention this when it gets to talking about BLCR, 
> or C/R in general.
> 
> Additionally, in general when configuring components, it would be nice in the 
> config logs if there was a way to get more details about the tests (and why 
> they failed) than just "can compile...no".  This may require more invasive 
> changes - not being super-knowledgeable about configure, I don't know how 
> much work this would be.
> 
> Lastly, the standard Open MPI documentation (particularly the FAQ) could be 
> updated in the C/R or BLCR sections to reflect the need for the 
> "--with-ft=cr" argument.
> 
> Again, I really appreciate the assistance.
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 




Re: [OMPI users] BLCR support not building on 1.5.3

2011-05-27 Thread Joshua Hursey
What version of BLCR are you using?

What other configure options are you passing to Open MPI? Specifically the 
configure test will always fail if '--with-ft=cr' is not specified - by default 
Open MPI will only build the BLCR component if C/R FT is requested by the user.
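
For reference, a configure invocation along these lines is what pulls in the
BLCR component (the install prefix and BLCR paths below are just placeholders
for your own installation; the FT thread options are optional but commonly
used together with C/R):

  ./configure --prefix=/opt/ompi \
      --with-ft=cr \
      --with-blcr=/opt/blcr --with-blcr-libdir=/opt/blcr/lib \
      --enable-ft-thread --enable-mpi-threads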

Can you send a zipped-up config.log to the list? That might show something that 
configure is missing.

Thanks,
Josh

On May 26, 2011, at 2:26 PM, Bill Johnstone wrote:

> Hello all.
> 
> I'm building 1.5.3 from source on a Debian Squeeze AMD64 system, and trying 
> to get BLCR support built-in.  I've installed all the packages that I think 
> should be relevant to BLCR support, including:
> 
> +blcr-dkms
> +libcr0
> +libcr-dev
> +blcr-util
> 
> I've also installed blcr-testsuite.  I only run Open MPI's configure after 
> loading the blcr modules, and the tests in blcr-testsuite pass.  The relevant 
> headers seem to be in /usr/include and the relevant libraries in /usr/lib .
> 
> I've tried three different invocations of configure:
> 
> 1. No BLCR-related arguments.
> 
> Output snippet from configure:
> checking --with-blcr value... simple ok (unspecified)
> checking --with-blcr-libdir value... simple ok (unspecified)
> checking if MCA component crs:blcr can compile... no
> 
> 2. With --with-blcr=/usr only
> 
> Output snippet from configure:
> checking --with-blcr value... sanity check ok (/usr)
> checking --with-blcr-libdir value... simple ok (unspecified)
> configure: WARNING: BLCR support requested but not found.  Perhaps you need 
> to specify the location of the BLCR libraries.
> configure: error: Aborting.
> 
> 3. With --with-blcr-libdir=/usr/lib only
> 
> Output snippet from configure:
> checking --with-blcr value... simple ok (unspecified)
> checking --with-blcr-libdir value... sanity check ok (/usr/lib)
> checking if MCA component crs:blcr can compile... no
> 
> 
> config.log only seems to contain the output of whatever tests were run to 
> determine whether or not blcr support could be compiled, but I don't see any 
> way to get details on what code and compile invocation actually failed, in 
> order to get to the root of the problem.  I'm not a configure or m4 expert, 
> so I'm not sure how to go further in troubleshooting this.
> 
> Help would be much appreciated.
> 
> Thanks!
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 




Re: [OMPI users] Unknown overhead in "mpirun -am ft-enable-cr"

2011-03-03 Thread Joshua Hursey
Thanks for the program. I created a ticket for this performance bug and 
attached the tarball to the ticket:
  https://svn.open-mpi.org/trac/ompi/ticket/2743

I do not know exactly when I will be able to get back to this, but hopefully 
soon. I added you to the CC so you should receive any progress updates 
regarding the ticket as we move forward.

Thanks again,
Josh

On Mar 3, 2011, at 2:12 AM, Nguyen Toan wrote:

> Dear Josh,
>  
> Attached with this email is a small program that illustrates the performance 
> problem. You can find simple instructions in the README file.
> There are also 2 sample result files (cpu.256^3.8N.*) which show the 
> execution time difference between 2 cases.
> Hope you can take some time to find the problem.
> Thanks for your kindness.
> 
> Best Regards,
> Nguyen Toan
> 
> On Wed, Mar 2, 2011 at 3:00 AM, Joshua Hursey <jjhur...@open-mpi.org> wrote:
> I have not had the time to look into the performance problem yet, and 
> probably won't for a little while. Can you send me a small program that 
> illustrates the performance problem, and I'll file a bug so we don't lose 
> track of it.
> 
> Thanks,
> Josh
> 
> On Feb 25, 2011, at 1:31 PM, Nguyen Toan wrote:
> 
> > Dear Josh,
> >
> > Did you find out the problem? I still cannot progress anything.
> > Hope to hear some good news from you.
> >
> > Regards,
> > Nguyen Toan
> >
> > On Sun, Feb 13, 2011 at 3:04 PM, Nguyen Toan <nguyentoan1...@gmail.com> 
> > wrote:
> > Hi Josh,
> >
> > I tried the MCA parameter you mentioned but it did not help, the unknown 
> > overhead still exists.
> > Here I attach the output of 'ompi_info', both version 1.5 and 1.5.1.
> > Hope you can find out the problem.
> > Thank you.
> >
> > Regards,
> > Nguyen Toan
> >
> > On Wed, Feb 9, 2011 at 11:08 PM, Joshua Hursey <jjhur...@open-mpi.org> 
> > wrote:
> > It looks like the logic in the configure script is turning on the FT thread 
> > for you when you specify both '--with-ft=cr' and '--enable-mpi-threads'.
> >
> > Can you send me the output of 'ompi_info'? Can you also try the MCA 
> > parameter that I mentioned earlier to see if that changes the performance?
> >
> > If there are many non-blocking sends and receives, there might be a 
> > performance bug with the way the point-to-point wrapper is tracking request 
> > objects. If the above MCA parameter does not help the situation, let me 
> > know and I might be able to take a look at this next week.
> >
> > Thanks,
> > Josh
> >
> > On Feb 9, 2011, at 1:40 AM, Nguyen Toan wrote:
> >
> > > Hi Josh,
> > > Thanks for the reply. I did not use the '--enable-ft-thread' option. Here 
> > > is my build options:
> > >
> > > CFLAGS=-g \
> > > ./configure \
> > > --with-ft=cr \
> > > --enable-mpi-threads \
> > > --with-blcr=/home/nguyen/opt/blcr \
> > > --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
> > > --prefix=/home/nguyen/opt/openmpi \
> > > --with-openib \
> > > --enable-mpirun-prefix-by-default
> > >
> > > My application requires lots of communication in every loop, focusing on 
> > > MPI_Isend, MPI_Irecv and MPI_Wait. Also I want to make only one 
> > > checkpoint per application execution for my purpose, but the unknown 
> > > overhead exists even when no checkpoint was taken.
> > >
> > > Do you have any other idea?
> > >
> > > Regards,
> > > Nguyen Toan
> > >
> > >
> > > On Wed, Feb 9, 2011 at 12:41 AM, Joshua Hursey <jjhur...@open-mpi.org> 
> > > wrote:
> > > There are a few reasons why this might be occurring. Did you build with 
> > > the '--enable-ft-thread' option?
> > >
> > > If so, it looks like I didn't move over the thread_sleep_wait adjustment 
> > > from the trunk - the thread was being a bit too aggressive. Try adding 
> > > the following to your command line options, and see if it changes the 
> > > performance.
> > >  "-mca opal_cr_thread_sleep_wait 1000"
> > >
> > > There are other places to look as well depending on how frequently your 
> > > application communicates, how often you checkpoint, process layout, ... 
> > > But usually the aggressive nature of the thread is the main problem.
> > >
> > > Let me know if that helps.
> > >
> > > -- Josh
> > >
> > > On Feb 8, 2011, at 2:50 AM, Nguyen Toan wrote:
> > >
> > > > Hi

Re: [OMPI users] Unknown overhead in "mpirun -am ft-enable-cr"

2011-03-01 Thread Joshua Hursey
I have not had the time to look into the performance problem yet, and probably 
won't for a little while. Can you send me a small program that illustrates the 
performance problem, and I'll file a bug so we don't lose track of it.

Thanks,
Josh

On Feb 25, 2011, at 1:31 PM, Nguyen Toan wrote:

> Dear Josh,
> 
> Did you find out the problem? I still cannot progress anything.
> Hope to hear some good news from you.
> 
> Regards,
> Nguyen Toan
> 
> On Sun, Feb 13, 2011 at 3:04 PM, Nguyen Toan <nguyentoan1...@gmail.com> wrote:
> Hi Josh,
> 
> I tried the MCA parameter you mentioned but it did not help, the unknown 
> overhead still exists.
> Here I attach the output of 'ompi_info', both version 1.5 and 1.5.1.
> Hope you can find out the problem.
> Thank you.
> 
> Regards,
> Nguyen Toan
> 
> On Wed, Feb 9, 2011 at 11:08 PM, Joshua Hursey <jjhur...@open-mpi.org> wrote:
> It looks like the logic in the configure script is turning on the FT thread 
> for you when you specify both '--with-ft=cr' and '--enable-mpi-threads'.
> 
> Can you send me the output of 'ompi_info'? Can you also try the MCA parameter 
> that I mentioned earlier to see if that changes the performance?
> 
> If there are many non-blocking sends and receives, there might be a performance 
> bug with the way the point-to-point wrapper is tracking request objects. If 
> the above MCA parameter does not help the situation, let me know and I might 
> be able to take a look at this next week.
> 
> Thanks,
> Josh
> 
> On Feb 9, 2011, at 1:40 AM, Nguyen Toan wrote:
> 
> > Hi Josh,
> > Thanks for the reply. I did not use the '--enable-ft-thread' option. Here 
> > is my build options:
> >
> > CFLAGS=-g \
> > ./configure \
> > --with-ft=cr \
> > --enable-mpi-threads \
> > --with-blcr=/home/nguyen/opt/blcr \
> > --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
> > --prefix=/home/nguyen/opt/openmpi \
> > --with-openib \
> > --enable-mpirun-prefix-by-default
> >
> > My application requires lots of communication in every loop, focusing on 
> > MPI_Isend, MPI_Irecv and MPI_Wait. Also I want to make only one checkpoint 
> > per application execution for my purpose, but the unknown overhead exists 
> > even when no checkpoint was taken.
> >
> > Do you have any other idea?
> >
> > Regards,
> > Nguyen Toan
> >
> >
> > On Wed, Feb 9, 2011 at 12:41 AM, Joshua Hursey <jjhur...@open-mpi.org> 
> > wrote:
> > There are a few reasons why this might be occurring. Did you build with the 
> > '--enable-ft-thread' option?
> >
> > If so, it looks like I didn't move over the thread_sleep_wait adjustment 
> > from the trunk - the thread was being a bit too aggressive. Try adding the 
> > following to your command line options, and see if it changes the 
> > performance.
> >  "-mca opal_cr_thread_sleep_wait 1000"
> >
> > There are other places to look as well depending on how frequently your 
> > application communicates, how often you checkpoint, process layout, ... But 
> > usually the aggressive nature of the thread is the main problem.
> >
> > Let me know if that helps.
> >
> > -- Josh
> >
> > On Feb 8, 2011, at 2:50 AM, Nguyen Toan wrote:
> >
> > > Hi all,
> > >
> > > I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
> > > I found that when running an application, which uses MPI_Isend, MPI_Irecv 
> > > and MPI_Wait,
> > > enabling C/R, i.e. using "-am ft-enable-cr", the application runtime is 
> > > much longer than the normal execution with mpirun (no checkpoint was 
> > > taken).
> > > This overhead becomes larger when the normal execution runtime is longer.
> > > Does anybody have any idea about this overhead, and how to eliminate it?
> > > Thanks.
> > >
> > > Regards,
> > > Nguyen
> > > ___
> > > users mailing list
> > > us...@open-mpi.org
> > > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > 
> > Joshua Hursey
> > Postdoctoral Research Associate
> > Oak Ridge National Laboratory
> > http://users.nccs.gov/~jjhursey
> >
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> >
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




Re: [OMPI users] --without-tm [SEC=UNCLASSIFIED]

2011-02-21 Thread Joshua Hursey
There is no restriction on using the C/R functionality in Open MPI in a TM 
environment (that I am aware of) if you use the ompi-checkpoint/ompi-restart 
commands directly.
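
For example, the direct workflow looks something like this (the application
name, process count, and mpirun PID below are placeholders):

  mpirun -np 16 -am ft-enable-cr ./my_app
  ompi-checkpoint -v <pid_of_mpirun>              # writes ompi_global_snapshot_<pid>.ckpt
  ompi-restart ompi_global_snapshot_<pid>.ckpt    # restart from the saved snapshot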

If you want TM to checkpoint/restart Open MPI processes for you as part of its 
resource management role, then there is a bit of a workaround that you have to 
go through. The 'cr_mpirun' wrapper (mentioned in the email) that BLCR is/will 
be providing does the necessary work to make the two cooperate. The BLCR 
folks would be the best people to contact if there are compatibility issues when 
using that script, since they maintain it.

-- Josh

On Feb 21, 2011, at 9:57 AM, Jeff Squyres wrote:

> On Feb 21, 2011, at 12:50 AM, DOHERTY, Greg wrote:
> 
>> blcr needs cr_mpirun to start the job without torque support to be able
>> to checkpoint the mpi job correctly.
> 
> Josh --
> 
> Do we have a restriction on BLCR support when used with TM?
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




Re: [OMPI users] Unknown overhead in "mpirun -am ft-enable-cr"

2011-02-09 Thread Joshua Hursey
It looks like the logic in the configure script is turning on the FT thread for 
you when you specify both '--with-ft=cr' and '--enable-mpi-threads'. 

Can you send me the output of 'ompi_info'? Can you also try the MCA parameter 
that I mentioned earlier to see if that changes the performance?

If there are many non-blocking sends and receives, there might be a performance 
bug with the way the point-to-point wrapper is tracking request objects. If the 
above MCA parameter does not help the situation, let me know and I might be 
able to take a look at this next week.

Thanks,
Josh

On Feb 9, 2011, at 1:40 AM, Nguyen Toan wrote:

> Hi Josh,
> Thanks for the reply. I did not use the '--enable-ft-thread' option. Here is 
> my build options:
> 
> CFLAGS=-g \
> ./configure \
> --with-ft=cr \
> --enable-mpi-threads \
> --with-blcr=/home/nguyen/opt/blcr \
> --with-blcr-libdir=/home/nguyen/opt/blcr/lib \
> --prefix=/home/nguyen/opt/openmpi \
> --with-openib \
> --enable-mpirun-prefix-by-default
> 
> My application requires lots of communication in every loop, focusing on 
> MPI_Isend, MPI_Irecv and MPI_Wait. Also I want to make only one checkpoint 
> per application execution for my purpose, but the unknown overhead exists 
> even when no checkpoint was taken.
> 
> Do you have any other idea?
> 
> Regards,
> Nguyen Toan
> 
> 
> On Wed, Feb 9, 2011 at 12:41 AM, Joshua Hursey <jjhur...@open-mpi.org> wrote:
> There are a few reasons why this might be occurring. Did you build with the 
> '--enable-ft-thread' option?
> 
> If so, it looks like I didn't move over the thread_sleep_wait adjustment from 
> the trunk - the thread was being a bit too aggressive. Try adding the 
> following to your command line options, and see if it changes the performance.
>  "-mca opal_cr_thread_sleep_wait 1000"
> 
> There are other places to look as well depending on how frequently your 
> application communicates, how often you checkpoint, process layout, ... But 
> usually the aggressive nature of the thread is the main problem.
> 
> Let me know if that helps.
> 
> -- Josh
> 
> On Feb 8, 2011, at 2:50 AM, Nguyen Toan wrote:
> 
> > Hi all,
> >
> > I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
> > I found that when running an application, which uses MPI_Isend, MPI_Irecv 
> > and MPI_Wait,
> > enabling C/R, i.e. using "-am ft-enable-cr", the application runtime is much 
> > longer than the normal execution with mpirun (no checkpoint was taken).
> > This overhead becomes larger when the normal execution runtime is longer.
> > Does anybody have any idea about this overhead, and how to eliminate it?
> > Thanks.
> >
> > Regards,
> > Nguyen
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> Joshua Hursey
> Postdoctoral Research Associate
> Oak Ridge National Laboratory
> http://users.nccs.gov/~jjhursey
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




Re: [OMPI users] Unknown overhead in "mpirun -am ft-enable-cr"

2011-02-08 Thread Joshua Hursey
There are a few reasons why this might be occurring. Did you build with the 
'--enable-ft-thread' option?

If so, it looks like I didn't move over the thread_sleep_wait adjustment from 
the trunk - the thread was being a bit too aggressive. Try adding the following 
to your command line options, and see if it changes the performance.
  "-mca opal_cr_thread_sleep_wait 1000"

There are other places to look as well depending on how frequently your 
application communicates, how often you checkpoint, process layout, ... But 
usually the aggressive nature of the thread is the main problem.

Let me know if that helps.

-- Josh

On Feb 8, 2011, at 2:50 AM, Nguyen Toan wrote:

> Hi all,
> 
> I am using the latest version of OpenMPI (1.5.1) and BLCR (0.8.2).
> I found that when running an application, which uses MPI_Isend, MPI_Irecv and 
> MPI_Wait,
> enabling C/R, i.e. using "-am ft-enable-cr", the application runtime is much 
> longer than the normal execution with mpirun (no checkpoint was taken).
> This overhead becomes larger when the normal execution runtime is longer.
> Does anybody have any idea about this overhead, and how to eliminate it?
> Thanks.
> 
> Regards,
> Nguyen
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




Re: [OMPI users] allow job to survive process death

2011-01-27 Thread Joshua Hursey

On Jan 27, 2011, at 9:47 AM, Reuti wrote:

> Am 27.01.2011 um 15:23 schrieb Joshua Hursey:
> 
>> The current version of Open MPI does not support continued operation of an 
>> MPI application after process failure within a job. If a process dies, so 
>> will the MPI job. Note that this is true of many MPI implementations out 
>> there at the moment.
>> 
>> At Oak Ridge National Laboratory, we are working on a version of Open MPI 
>> that will be able to run-through process failure, if the application wishes 
>> to do so. The semantics and interfaces needed to support this functionality 
>> are being actively developed by the MPI Forum's Fault Tolerance Working 
>> Group, and can be found at the wiki page below:
>> https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/run_through_stabilization
> 
> I had a look at this document, but what is really covered - the application 
> has to react to the notification of a failed rank and act appropriately on its 
> own?

Yes. This is to support application-based fault tolerance (ABFT). Libraries 
could be developed on top of these semantics to hide some of the fault handling. 
The purpose is to enable fault-tolerant MPI applications and libraries to be 
built on top of MPI.

This document only covers run-through stabilization, not process recovery, at 
the moment. So the application will have well defined semantics to allow it to 
continue processing without the failed process. Recovering the failed process 
is not specified in this document. That is the subject of a supplemental 
document in preparation - the two proposals are meant to be complementary and 
build upon one another.

> 
> Having a true ability to survive a dying process (i.e. rank) which might have 
> been computing already for hours would mean having some kind of "rank RAID" or 
> "rank Parchive". E.g. start 12 ranks when you need 10 - whatever 2 ranks 
> fail, your job will still be ready in time.

Yes, that is one possible technique. So once a process failure occurs, the 
application is notified via the existing error handling mechanisms. The 
application is then responsible for determining how best to recover from that 
process failure. This could include using MPI_Comm_spawn to create new 
processes (useful in manager/worker applications), recovering the state from an 
in-memory checksum, using spare processes in the communicator, rolling back 
some/all ranks to an application level checkpoint, ignoring the failure and 
allowing the residual error to increase, aborting the job or a single 
sub-communicator, ... the list goes on. But the purpose of the proposal is to 
allow an application or library to start building such techniques based on 
portable semantics and well defined interfaces.

Does that help clarify?


If you would like to discuss the developing proposals further or have input on 
how to make it better, I would suggest moving the discussion to the MPI3-ft 
mailing list so other groups can participate that do not normally follow the 
Open MPI lists. The mailing list information is below:
  http://lists.mpi-forum.org/mailman/listinfo.cgi/mpi3-ft


-- Josh

> 
> -- Reuti
> 
> 
>> This work is on-going, but once we have a stable prototype we will assess 
>> how to bring it back to the mainline Open MPI trunk. For the moment, there 
>> is no public release of this branch, but once there is we will be sure to 
>> announce it on the appropriate Open MPI mailing list for folks to start 
>> playing around with it.
>> 
>> -- Josh
>> 
>> On Jan 27, 2011, at 9:11 AM, Kirk Stako wrote:
>> 
>>> Hi,
>>> 
>>> I was wondering what support Open MPI has for allowing a job to
>>> continue running when one or more processes in the job die
>>> unexpectedly? Is there a special mpirun flag for this? Any other ways?
>>> 
>>> It seems obvious that collectives will fail once a process dies, but
>>> would it be possible to create a new group (if you knew which ranks
>>> are dead) that excludes the dead processes - then turn this group into
>>> a working communicator?
>>> 
>>> Thanks,
>>> Kirk
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> 
>> 
>> 
>> Joshua Hursey
>> Postdoctoral Research Associate
>> Oak Ridge National Laboratory
>> http://users.nccs.gov/~jjhursey
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




Re: [OMPI users] allow job to survive process death

2011-01-27 Thread Joshua Hursey
The current version of Open MPI does not support continued operation of an MPI 
application after process failure within a job. If a process dies, so will the 
MPI job. Note that this is true of many MPI implementations out there at the 
moment.

At Oak Ridge National Laboratory, we are working on a version of Open MPI that 
will be able to run-through process failure, if the application wishes to do 
so. The semantics and interfaces needed to support this functionality are being 
actively developed by the MPI Forum's Fault Tolerance Working Group, and can be 
found at the wiki page below:
  https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/ft/run_through_stabilization

This work is on-going, but once we have a stable prototype we will assess how 
to bring it back to the mainline Open MPI trunk. For the moment, there is no 
public release of this branch, but once there is we will be sure to announce it 
on the appropriate Open MPI mailing list for folks to start playing around with 
it.

-- Josh

On Jan 27, 2011, at 9:11 AM, Kirk Stako wrote:

> Hi,
> 
> I was wondering what support Open MPI has for allowing a job to
> continue running when one or more processes in the job die
> unexpectedly? Is there a special mpirun flag for this? Any other ways?
> 
> It seems obvious that collectives will fail once a process dies, but
> would it be possible to create a new group (if you knew which ranks
> are dead) that excludes the dead processes - then turn this group into
> a working communicator?
> 
> Thanks,
> Kirk
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




[OMPI users] Fwd: BLCR at SC10

2010-11-14 Thread Joshua Hursey
For those following the developing fault tolerance capabilities of Open MPI 
(particularly BLCR and CIFTS FTB support), the events hosted by 
Lawrence Berkeley National Laboratory (forwarded below) may be of interest.

Also the Indiana University booth has a demonstration of "Application-level 
Fault Tolerance in Open MPI through Preemptive Process Migration and 
Resiliency" that may be of interest.
  http://sc10.supercomputing.iu.edu/demos

-- Josh

Begin forwarded message:

> From: "Paul H. Hargrove"
> Date: November 14, 2010 11:23:01 AM CST
> Subject: BLCR at SC10
> 
> Hello BLCR users,
> 
> I am writing to let you all know about some BLCR-related events at SC10 
> in New Orleans this week.
> 
> Tues Nov 16:
>  12:15 to 1:15  CIFTS BoF (CIFTS = Coordinated Infrastructure for Fault 
> Tolerant Systems)
>  3:00 to 4:00 CIFTS round-table discussion in LBNL booth #2448
> 
> Wed Nov 17:
>   3:00 to 4:00 BLCR round-table discussion in LBNL booth #2448
>   5:30 to 6:30 brief BLCR talk at TORQUE BoF
> 
> -Paul
> 
> -- 
> Paul H. Hargrove  phhargr...@lbl.gov
> Future Technologies Group
> HPC Research Department   Tel: +1-510-495-2352
> Lawrence Berkeley National Laboratory Fax: +1-510-486-6900
> 


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://users.nccs.gov/~jjhursey




Re: [OMPI users] Running on crashing nodes

2010-09-24 Thread Joshua Hursey
As one of the Open MPI developers actively working on the MPI layer 
stabilization/recovery feature set, I don't think we can give you a specific 
timeframe for availability, especially availability in a stable release. Once 
the initial functionality is finished, we will open it up for user testing by 
making a public branch available. After addressing the concerns highlighted by 
public testing, we will attempt to work this feature into the mainline trunk 
and eventual release.

Unfortunately it is difficult to assess the time needed to go through these 
development stages. What I can tell you is that the work to this point on the 
MPI layer is looking promising, and that as soon as we feel that the code is 
ready we will make it available to the public for further testing.

-- Josh

On Sep 24, 2010, at 3:37 AM, Andrei Fokau wrote:

> Ralph, could you tell us when this functionality will be available in the 
> stable version? A rough estimate will be fine.
> 
> 
> On Fri, Sep 24, 2010 at 01:24, Ralph Castain <r...@open-mpi.org> wrote:
> In a word, no. If a node crashes, OMPI will abort the currently-running job 
> if it had processes on that node. There is no current ability to "ride-thru" 
> such an event.
> 
> That said, there is work being done to support "ride-thru". Most of that is 
> in the current developer's code trunk, and more is coming, but I wouldn't 
> consider it production-quality just yet.
> 
> Specifically, the code that does what you specify below is done and works. It 
> is recovery of the MPI job itself (collectives, lost messages, etc.) that 
> remains to be completed.
> 
> 
> On Thu, Sep 23, 2010 at 7:22 AM, Andrei Fokau <andrei.fo...@neutron.kth.se> 
> wrote:
> Dear users,
> 
> Our cluster has a number of nodes which have a high probability of crashing, so 
> it happens quite often that calculations stop due to one node going down. Maybe 
> you know if it is possible to block the crashed nodes during run-time when 
> running with OpenMPI? I am asking about the principal possibility of programming 
> such behavior. Does OpenMPI allow such dynamic checking? The scheme I am curious 
> about is the following:
> 
> 1. A code starts its tasks via mpirun on several nodes
> 2. At some moment one node gets down
> 3. The code realizes that the node is down (the results are lost) and 
> excludes it from the list of nodes to run its tasks on
> 4. At later moment the user restarts the crashed node
> 5. The code notices that the node is up again, and puts it back to the list 
> of active nodes
> 
> 
> Regards,
> Andrei
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> _______
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://www.cs.indiana.edu/~jjhursey




Re: [OMPI users] Question on staging in checkpoint

2010-09-13 Thread Joshua Hursey
Adjust the 'filem_rsh_max_incomming' parameter:
 http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-filem_rsh_max_incomming

I defaulted this MCA parameter to 10 since, depending on how big each 
individual checkpoint is, sending them all at once is often worse than 
sending only a window of them at a time. I would recommend 
trying a few different values for this parameter and seeing the impact it has 
both on checkpoint overhead (additional application overhead) and checkpoint 
latency (the time it takes for the checkpoint to completely finish).
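
For example, you could set it in your MCA parameter file or per run (the
values below are just starting points to experiment with, not recommendations):

  # $HOME/.openmpi/mca-params.conf
  filem_rsh_max_incomming=64

  # or on the command line:
  mpirun -np 128 -am ft-enable-cr -mca filem_rsh_max_incomming 64 ./my_app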

-- Josh

On Sep 13, 2010, at 7:42 PM, <ananda.mu...@wipro.com> <ananda.mu...@wipro.com> 
wrote:

> Hi
>  
> I was trying out the staging option in checkpointing, where I save the checkpoint 
> image in the local file system and have the image transferred to the global 
> file system in the background. As part of the background process I see that 
> the “scp” command is launched to transfer the images from local file system 
> to global file system. I am using openmpi-1.5rc6 with BLCR 0.8.2.
>  
> In my experiment, I had about 128 cores save their respective checkpoint 
> images on the local file system. During the background process, I see that only 
> 10 “scp” requests are sent at a time. Is this a configurable parameter? Since 
> these commands will run on respective nodes, how can I launch all 128 scp 
> requests (to take care of all 128 images in my experiment) simultaneously?
>  
> Thanks
> Ananda


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://www.cs.indiana.edu/~jjhursey




Re: [OMPI users] High Checkpoint Overhead Ratio

2010-08-31 Thread Joshua Hursey
Have you tried testing without using NFS? That is, setting the mca-params.conf to 
something like:
crs_base_snapshot_dir=/tmp/
snapc_base_global_snapshot_dir=/tmp/global
snapc_base_store_in_place=0

This would remove the NFS time from the checkpoint time. However if you are 
using staging this may or may not reduce the application overhead significantly.

If you want to save to NFS, and it is globally mounted you could try setting 
the 'snapc_base_global_shared' parameter (deprecated in the trunk) which tells 
the system to use standard UNIX copy commands (i.e., cp) instead of the rsh 
varieties.

You might try changing the '--mca filem_rsh_max_incomming' parameter (default 
10) to increase or decrease the number of concurrent rcp/scp operations.

Something else to try is to look at the SnapC timing to pinpoint where the 
system is taking the most time:
  snapc_full_enable_timing=1

Since you are using the C/R thread, it takes up some CPU cycles that may 
interfere with application performance. You can adjust the aggressiveness of 
this thread with the 'opal_cr_thread_sleep_wait' parameter. In 1.5.0 it 
defaults to 0 microseconds, but on the trunk this has been adjusted to 1000 
microseconds. Try setting the parameter:
  opal_cr_thread_sleep_wait=1000

Depending on how much memory is required by CG.C and available on each node, 
you may be hitting a memory limit that BLCR is struggling to overcome. What 
happens if you reduce the number of processes per node?

Those are some things to play around with to see what works best for your 
system and application. For a full list of parameters available in the C/R 
infrastructure see the link below:
  http://osl.iu.edu/research/ft/ompi-cr/api.php
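
Putting the suggestions above together, a starting mca-params.conf might look
like the sketch below (the values are only starting points to experiment with,
not tuned recommendations):

  crs_base_snapshot_dir=/tmp/
  snapc_base_global_snapshot_dir=/tmp/global
  snapc_base_store_in_place=0
  filem_rsh_max_incomming=20
  opal_cr_thread_sleep_wait=1000
  snapc_full_enable_timing=1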

-- Josh

On Aug 30, 2010, at 11:08 PM, 陈文浩 wrote:

> Dear OMPI Users,
>  
> I’m now using BLCR-0.8.2 and OpenMPI-1.5rc5. The problem is that it takes a 
> very long time to checkpoint.
>  
> BLCR configuration:
> ./configure --prefix=/opt/blcr --enable-static
> OpenMPI configuration:
> ./configure --prefix=/opt/ompi --with-ft=cr --with-blcr=/opt/blcr 
> --enable-static  --enable-ft-thread --enable-mpi-threads
>  
> Our blades use NFS. $HOME and /opt are shared.
>  
> In $HOME/.openmpi/mca-params.conf:
> crs_base_snapshot_dir=/tmp/
> snapc_base_global_snapshot_dir=/home/chenwh
> snapc_base_store_in_place=0
>  
>  
> Now I run CG NPB (NPROCS=16, CLASS=C) on two nodes (blade02, blade04).
> With no checkpoint, 'Time in seconds' is about 100s. It's normal.
> But when I take a single checkpoint, 'Time in seconds' is up to 300s. The 
> overhead ratio is over 200%! WHY? How can I improve it?
>  
> blade02:~> ompi-checkpoint --status 27115
> [blade02:27130] [  0.00 /   0.25] Requested - ...
> [blade02:27130] [  0.00 /   0.25]   Pending - ...
> [blade02:27130] [  0.21 /   0.46]   Running - ...
> [blade02:27130] [221.25 / 221.71]  Finished - 
> ompi_global_snapshot_27115.ckpt
> Snapshot Ref.:   0 ompi_global_snapshot_27115.ckpt
>  
> As you see, it takes 200+ seconds to checkpoint. By the way, what do the former 
> and latter numbers represent in [ , ]?
>  
> Regards
>  
> Whchen
> 


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://www.cs.indiana.edu/~jjhursey







Re: [OMPI users] OpenMPI with BLCR runtime problem

2010-08-24 Thread Joshua Hursey

On Aug 24, 2010, at 10:27 AM, 陈文浩 wrote:

> Dear OMPI users,
>  
> I configured and installed OpenMPI-1.4.2 and BLCR-0.8.2. (blade01 - blade10, 
> nfs)
> BLCR configure script: ./configure --prefix=/opt/blcr --enable-static
> After the installation, I can see the 'blcr' module loaded correctly (lsmod | 
> grep blcr). And I can also run 'cr_run', 'cr_checkpoint', 'cr_restart' to C/R 
> the examples correctly under /blcr/examples/.
> Then, OMPI configure script is: ./configure --prefix=/opt/ompi --with-ft=cr 
> --with-blcr=/opt/blcr --enable-ft-thread --enable-mpi-threads --enable-static
> The installation is okay too.
>  
> Then here comes the problem.
> On one node:
>  mpirun -np 2 ./hello_c.c
>  mpirun -np 2 -am ft-enable-cr ./hello_c.c
>  are both okay.
> On two nodes (blade01, blade02):
>  mpirun -np 2 -machinefile mf ./hello_c.c  OK.
> mpirun -np 2 -machinefile mf -am ft-enable-cr ./hello_c.c ERROR. Listed 
> below:
>  
> *** An error occurred in MPI_Init 
> *** before MPI was initialized 
> *** MPI_ERRORS_ARE_FATAL (your MPI job will now abort) 
> [blade02:28896] Abort before MPI_INIT completed successfully; not able to 
> guarantee that all other processes were killed! 
> -- 
> It looks like opal_init failed for some reason; your parallel process is 
> likely to abort. There are many reasons that a parallel process can 
> fail during opal_init; some of which are due to configuration or 
> environment problems. This failure appears to be an internal failure; 
> here's some additional information (which may only be relevant to an 
> Open MPI developer):
>   opal_cr_init() failed failed 
>   --> Returned value -1 instead of OPAL_SUCCESS 
> -- 
> [blade02:28896] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file 
> runtime/orte_init.c at line 77 
> -- 
> It looks like MPI_INIT failed for some reason; your parallel process is 
> likely to abort. There are many reasons that a parallel process can 
> fail during MPI_INIT; some of which are due to configuration or environment 
> problems. This failure appears to be an internal failure; here's some 
> additional information (which may only be relevant to an Open MPI 
> developer):
>   ompi_mpi_init: orte_init failed 
>   --> Returned "Error" (-1) instead of "Success" (0) 
> --
>  
> I have no idea about the error. Our blades use nfs, does it matter? Can 
> anyone help me solve the problem? I really appreciate it. Thank you.
>  
> btw, similar error like:
> “Oops, cr_init() failed (the initialization call to the BLCR checkpointing 
> system). Abort in despair.
> The crmpi SSI subsystem failed to initialized modules successfully during 
> MPI_INIT. This is a fatal error; I must abort.” occurs when I use LAM/MPI + 
> BLCR.

This seems to indicate that BLCR is not working correctly on one of the compute 
nodes. Did you try some of the BLCR example programs on both of the compute 
nodes? If BLCR's cr_init() fails, then there is not much the MPI library can do 
for you.

I would check the installation of BLCR on all of the compute nodes (blade01 and 
blade02). Make sure the modules are loaded and that the BLCR single process 
examples work on all nodes. I suspect that one of the nodes is having trouble 
initializing the BLCR library.

You may also want to check to make sure prelinking is turned off on all nodes 
as well:
  https://upc-bugs.lbl.gov//blcr/doc/html/FAQ.html#prelink
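
A quick per-node sanity sweep might look like the sketch below (the hostnames
and BLCR install prefix are placeholders for your setup):

  for h in blade01 blade02; do
      echo "== $h =="
      ssh $h 'lsmod | grep blcr'                                      # kernel modules loaded?
      ssh $h 'test -x /opt/blcr/bin/cr_checkpoint && echo "BLCR tools present"'
  done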

If that doesn't work then I would suggest trying the current Open MPI trunk. 
There should not be any problem with using NFS: since this is occurring in 
MPI_Init, we are well before we ever try to use the file system. I also test 
with NFS, and local staging on a fairly regular basis, so it shouldn't be a 
problem even when checkpointing/restarting.

-- Josh

>  
> Regards
>  
> whchen
>  
> 


Joshua Hursey
Postdoctoral Research Associate
Oak Ridge National Laboratory
http://www.cs.indiana.edu/~jjhursey







[OMPI users] Checkpoint/Restart Process Migration and Automatic Recovery Support

2010-08-19 Thread Joshua Hursey
I am pleased to announce that Open MPI now supports checkpoint/restart process 
migration and automatic recovery. This is in addition to our current support 
for more traditional checkpoint/restart fault tolerance. These new features 
were introduced in the Open MPI development trunk in commit r23587. These new 
features are currently being scheduled for release in the v1.5.1 release of 
Open MPI.

In addition to the two features mentioned above, this commit also includes 
support for C/R-enabled parallel debugging (documentation for this feature will 
become available in September). It also introduces an API for the C/R 
functionality, allowing applications to request a checkpoint, restart, or 
migration from within the application. We also abstracted the stable storage 
technique and added support for checkpoint caching and compression. So there is 
lots of good stuff to play with.

At the bottom of this email is a list of the new features, major/minor changes 
and bug fixes that are included in this release. The current implementation 
deprecates some MCA parameters to make way for a more extensible C/R 
infrastructure. So please check the online documentation for information on how 
to use this new functionality. Documentation is available at the link below:
 http://osl.iu.edu/research/ft/
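
As a quick taste of the staging pieces, an mca-params.conf might look like the
sketch below. Treat it as a sketch only - the on/off values are my assumptions
here, so please check the documentation above for the authoritative parameter
names and defaults:

  sstore=stage
  sstore_stage_local_snapshot_dir=/tmp
  sstore_base_global_snapshot_dir=/home/user/checkpoints
  sstore_stage_compress=1
  sstore_stage_caching=1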

If you have any questions or problems using these new features please send them 
to the users list.

Enjoy :)
Josh

--
Major Changes: 
-- 
* Add two new ErrMgr recovery policies to the 'hnp' ErrMgr component
  * {{{crmig}}} C/R Process Migration 
  * {{{autor}}} C/R Automatic Recovery 
* Added C/R-enabled Debugging support. 
  Enabled with the --enable-crdebug flag. See the following website for more 
information: 
  http://osl.iu.edu/research/ft/crdebug/ 
* Added Stable Storage (SStore) framework for checkpoint storage 
  * 'central' component does a direct to central storage save 
  * 'stage' component stages checkpoints to central storage while the 
application continues execution. 
* 'stage' supports offline compression of checkpoints before moving 
(sstore_stage_compress) 
* 'stage' supports local caching of checkpoints to improve automatic 
recovery (sstore_stage_caching) 
* Added Compression (compress) framework to support 
* Added the {{{ompi-migrate}}} command line tool to support the {{{crmig}}} 
ErrMgr recovery policy 
* Added CR MPI Ext functions (enable them with {{{--enable-mpi-ext=cr}}} 
configure option) 
  * {{{OMPI_CR_Checkpoint}}} (Fixes #2342) 
  * {{{OMPI_CR_Restart}}} 
  * {{{OMPI_CR_Migrate}}} (may need some more work for mapping rules) 
  * {{{OMPI_CR_INC_register_callback}}} (Fixes #2192) 
  * {{{OMPI_CR_Quiesce_start}}} 
  * {{{OMPI_CR_Quiesce_checkpoint}}} 
  * {{{OMPI_CR_Quiesce_end}}} 
  * {{{OMPI_CR_self_register_checkpoint_callback}}} 
  * {{{OMPI_CR_self_register_restart_callback}}} 
  * {{{OMPI_CR_self_register_continue_callback}}} 
* The ErrMgr predicted_fault() interface has been changed to take an 
opal_list_t of ErrMgr defined types. This will allow us to better support a 
wider range of fault prediction services in the future. 
* Add a progress meter to: 
  * FileM rsh (filem_rsh_process_meter) 
  * SnapC full (snapc_full_progress_meter) 
  * SStore stage (sstore_stage_progress_meter) 
* Added 2 new command line options to ompi-restart 
  * --showme : Display the full command line that would have been exec'ed. 
  * --mpirun_opts : Command line options to pass directly to mpirun. (Fixes 
#2413) 
* Deprecated some MCA params: 
  * crs_base_snapshot_dir deprecated, use sstore_stage_local_snapshot_dir 
  * snapc_base_global_snapshot_dir deprecated, use 
sstore_base_global_snapshot_dir 
  * snapc_base_global_shared deprecated, use sstore_stage_global_is_shared 
  * snapc_base_store_in_place deprecated, replaced with different components of 
SStore 
  * snapc_base_global_snapshot_ref deprecated, use 
sstore_base_global_snapshot_ref 
  * snapc_base_establish_global_snapshot_dir deprecated, never well supported 
  * snapc_full_skip_filem deprecated, use sstore_stage_skip_filem 

Minor Changes: 
-- 
* Fixes #1924 : {{{ompi-restart}}} now recognizes path prefixed checkpoint 
handles and does the right thing. 
* Fixes #2097 : {{{ompi-info}}} should now report all available CRS components 
* Fixes #2161 : Manual checkpoint movement. A user can 'mv' a checkpoint 
directory from the original location to another and still restart from it. 
* Fixes #2208 : Honor various TMPDIR variables instead of forcing {{{/tmp}}} 
* Move {{{ompi_cr_continue_like_restart}}} to 
{{{orte_cr_continue_like_restart}}} to be more flexible in where this should be 
set. 
* opal_crs_base_metadata_write* functions have been moved to SStore to support 
a wider range of metadata handling functionality. 
* Cleanup the CRS framework and components to work with the SStore framework. 
* Cleanup the SnapC framework and components to work with 

Re: [OMPI users] Checkpointing mpi4py program

2010-08-18 Thread Joshua Hursey
I just fixed the --stop bug that you highlighted in r23627.

As far as the mpi4py program, I don't really know what to suggest. I don't have 
a setup to test this locally and am completely unfamiliar with mpi4py. Can you 
reproduce this with just a C program?
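
Something along these lines is the kind of check I have in mind (the program
name and mpirun PID are placeholders) - a plain C MPI loop, checkpointed the
same way, to see whether the hang is specific to the Python runtime:

  mpicc -o loop_test loop_test.c
  mpirun -np 4 -am ft-enable-cr ./loop_test &
  ompi-checkpoint -v <pid_of_mpirun>
  # then check whether loop_test keeps making progress after the checkpoint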

-- Josh

On Aug 16, 2010, at 12:25 PM, <ananda.mu...@wipro.com> <ananda.mu...@wipro.com> 
wrote:

> Josh
>  
> I have one more update on my observation while analyzing this issue.
>  
> Just to refresh, I am using openmpi-trunk release 23596 with mpi4py-1.2.1 and 
> BLCR 0.8.2. When I checkpoint the python script written using mpi4py, the 
> program doesn’t progress after the checkpoint is taken successfully. I tried 
> it with openmpi 1.4.2 and then tried it with the latest trunk version as 
> suggested. I see similar behavior in both releases.
>  
> I have one more interesting observation which I thought may be useful. I 
> tried the “-stop” option of ompi-checkpoint (trunk version) and the mpirun 
> prints the following error messages when I run the command “ompi-checkpoint 
> –stop –v ”:
>  
>  Error messages in the window where mpirun command was running START 
> ==
> [hpdcnln001:15148] Error: (   app) Passed an invalid handle (0) [5 
> ="/tmp/openmpi-sessions-amudar@hpdcnln001_0/37739/1"]
> [hpdcnln001:15148] [[37739,1],2] ORTE_ERROR_LOG: Error in file 
> ../../../../../orte/mca/sstore/central/sstore_central_module.c at line 253
> [hpdcnln001:15149] Error: (   app) Passed an invalid handle (0) [5 
> ="/tmp/openmpi-sessions-amudar@hpdcnln001_0/37739/1"]
> [hpdcnln001:15149] [[37739,1],3] ORTE_ERROR_LOG: Error in file 
> ../../../../../orte/mca/sstore/central/sstore_central_module.c at line 253
> [hpdcnln001:15146] Error: (   app) Passed an invalid handle (0) [5 
> ="/tmp/openmpi-sessions-amudar@hpdcnln001_0/37739/1"]
> [hpdcnln001:15146] [[37739,1],0] ORTE_ERROR_LOG: Error in file 
> ../../../../../orte/mca/sstore/central/sstore_central_module.c at line 253
> [hpdcnln001:15147] Error: (   app) Passed an invalid handle (0) [5 
> ="/tmp/openmpi-sessions-amudar@hpdcnln001_0/37739/1"]
> [hpdcnln001:15147] [[37739,1],1] ORTE_ERROR_LOG: Error in file 
> ../../../../../orte/mca/sstore/central/sstore_central_module.c at line 253
>  Error messages in the window where mpirun command was running END 
> ==
>  
> Please note that the checkpoint image was created at the end of it. However 
> when I run the command "kill -CONT ", it fails to move forward, 
> which is the same as the original problem I have reported.
>  
> Let me know if you need any additional information.
>  
> Thanks for your time in advance
>  
> -  Ananda
>  
> Ananda B Mudar, PMP
> Senior Technical Architect
> Wipro Technologies
> Ph: 972 765 8093
> ananda.mu...@wipro.com
>  
> From: Ananda Babu Mudar (WT01 - Energy and Utilities) 
> Sent: Sunday, August 15, 2010 11:25 PM
> To: us...@open-mpi.org
> Subject: Re: [OMPI users] Checkpointing mpi4py program
> Importance: High
>  
> Josh
> 
> I tried running the mpi4py program with the latest trunk version of openmpi. 
> I have compiled openmpi-1.7a1r23596 from trunk and recompiled mpi4py to use 
> this library. Unfortunately I see the same behavior as I have seen with 
> openmpi 1.4.2, i.e., the checkpoint will be successful but the program doesn't 
> proceed after that.
> 
> I have attached the stack traces of all the MPI processes that are part of 
> the mpirun. I would really appreciate it if you could take a look at the stack 
> trace and let me know the potential problem. I am kind of stuck at this point and need 
> your assistance to move forward. Please let me know if you need any 
> additional information.
> 
> Thanks for your time in advance
> 
> Thanks
> 
> Ananda
> 
> -Original Message- 
> Subject: Re: [OMPI users] Checkpointing mpi4py program
> From: Joshua Hursey (jjhursey_at_[hidden])
> Date: 2010-08-13 12:28:31
> 
> Nope. I probably won't get to it for a while. I'll let you know if I do.
> 
> On Aug 13, 2010, at 12:17 PM, <ananda.mudar_at_[hidden]> 
> <ananda.mudar_at_[hidden]> wrote:
> 
> > OK, I will do that. 
> > 
> > But did you try this program on a system where the latest trunk is 
> > installed? Were you successful in checkpointing? 
> > 
> > - Ananda 
> > -Original Message- 
> > Message: 9 
> > Date: Fri, 13 Aug 2010 10:21:29 -0400 
> > From: Joshua Hursey <jjhursey_at_[hidden]> 
> > Subject: Re: [OMPI users] users Digest, Vol 1658, Issue 2 
> > To: Open MPI Users <users_at_[hidden]> 
> > Message-ID: <7A43615B-A462-4C

Re: [OMPI users] Checkpointing mpi4py program

2010-08-13 Thread Joshua Hursey
Nope. I probably won't get to it for a while. I'll let you know if I do.

On Aug 13, 2010, at 12:17 PM, <ananda.mu...@wipro.com> <ananda.mu...@wipro.com> 
wrote:

> OK, I will do that.
> 
> But did you try this program on a system where the latest trunk is
> installed? Were you successful in checkpointing?
> 
> - Ananda
> -Original Message-
> Message: 9
> Date: Fri, 13 Aug 2010 10:21:29 -0400
> From: Joshua Hursey <jjhur...@open-mpi.org>
> Subject: Re: [OMPI users] users Digest, Vol 1658, Issue 2
> To: Open MPI Users <us...@open-mpi.org>
> Message-ID: <7a43615b-a462-4c72-8112-496653d8f...@open-mpi.org>
> Content-Type: text/plain; charset=us-ascii
> 
> I probably won't have an opportunity to work on reproducing this on the
> 1.4.2. The trunk has a bunch of bug fixes that probably will not be
> backported to the 1.4 series (things have changed too much since that
> branch). So I would suggest trying the 1.5 series.
> 
> -- Josh
> 
> On Aug 13, 2010, at 10:12 AM, <ananda.mu...@wipro.com>
> <ananda.mu...@wipro.com> wrote:
> 
>> Josh
>> 
>> I am having problems compiling the sources from the latest trunk. It
>> complains of libgomp.spec missing even though that file exists on my
>> system. I will see if I have to change any other environment variables
>> to have a successful compilation. I will keep you posted.
>> 
>> BTW, were you successful in reproducing the problem on a system with
>> OpenMPI 1.4.2?
>> 
>> Thanks
>> Ananda
>> -Original Message-
>> Date: Thu, 12 Aug 2010 09:12:26 -0400
>> From: Joshua Hursey <jjhur...@open-mpi.org>
>> Subject: Re: [OMPI users] Checkpointing mpi4py program
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID: <1f1445ab-9208-4ef0-af25-5926bd53c...@open-mpi.org>
>> Content-Type: text/plain; charset=us-ascii
>> 
>> Can you try this with the current trunk (r23587 or later)?
>> 
>> I just added a number of new features and bug fixes, and I would be
>> interested to see if it fixes the problem. In particular I suspect that
>> this might be related to the Init/Finalize bounding of the checkpoint
>> region.
>> 
>> -- Josh
>> 
>> On Aug 10, 2010, at 2:18 PM, <ananda.mu...@wipro.com>
>> <ananda.mu...@wipro.com> wrote:
>> 
>>> Josh
>>> 
>>> Please find attached the python program that reproduces the hang that
>>> I described. Initial part of this file describes the prerequisite
>>> modules and the steps to reproduce the problem. Please let me know if
>>> you have any questions in reproducing the hang.
>>> 
>>> Please note that, if I add the following lines at the end of the program
>>> (in case sleep_time is True), the problem disappears, i.e., the program
>>> resumes successfully after successful completion of the checkpoint.
>>> # Add following lines at the end for sleep_time is True
>>> else:
>>>     time.sleep(0.1)
>>> # End of added lines
>>> 
>>> 
>>> Thanks a lot for your time in looking into this issue.
>>> 
>>> Regards
>>> Ananda
>>> 
>>> Ananda B Mudar, PMP
>>> Senior Technical Architect
>>> Wipro Technologies
>>> Ph: 972 765 8093
>>> ananda.mu...@wipro.com
>>> 
>>> 
>>> -Original Message-
>>> Date: Mon, 9 Aug 2010 16:37:58 -0400
>>> From: Joshua Hursey <jjhur...@open-mpi.org>
>>> Subject: Re: [OMPI users] Checkpointing mpi4py program
>>> To: Open MPI Users <us...@open-mpi.org>
>>> Message-ID: <270bd450-743a-4662-9568-1fedfcc6f...@open-mpi.org>
>>> Content-Type: text/plain; charset=windows-1252
>>> 
>>> I have not tried to checkpoint an mpi4py application, so I cannot say
>>> for sure if it works or not. You might be hitting something with the
>>> Python runtime interacting in an odd way with either Open MPI or
> BLCR.
>>> 
>>> Can you attach a debugger and get a backtrace on a stuck checkpoint?
>>> That might show us where things are held up.
>>> 
>>> -- Josh
>>> 
>>> 
>>> On Aug 9, 2010, at 4:04 PM, <ananda.mu...@wipro.com>
>>> <ananda.mu...@wipro.com> wrote:
>>> 
>>>> Hi
>>>> 
>>>> I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR
>>> 0.8.2. When I run ompi-checkpoint on the program written using
> mpi4py,
>> I
>>> see that program doesn?t resu

Re: [OMPI users] users Digest, Vol 1658, Issue 2

2010-08-13 Thread Joshua Hursey
I probably won't have an opportunity to work on reproducing this on the 1.4.2. 
The trunk has a bunch of bug fixes that probably will not be backported to the 
1.4 series (things have changed too much since that branch). So I would suggest 
trying the 1.5 series.

-- Josh

On Aug 13, 2010, at 10:12 AM, <ananda.mu...@wipro.com> <ananda.mu...@wipro.com> 
wrote:

> Josh
> 
> I am having problems compiling the sources from the latest trunk. It
> complains of libgomp.spec missing even though that file exists on my
> system. I will see if I have to change any other environment variables
> to have a successful compilation. I will keep you posted.
> 
> BTW, were you successful in reproducing the problem on a system with
> OpenMPI 1.4.2?
> 
> Thanks
> Ananda
> -Original Message-----
> Date: Thu, 12 Aug 2010 09:12:26 -0400
> From: Joshua Hursey <jjhur...@open-mpi.org>
> Subject: Re: [OMPI users] Checkpointing mpi4py program
> To: Open MPI Users <us...@open-mpi.org>
> Message-ID: <1f1445ab-9208-4ef0-af25-5926bd53c...@open-mpi.org>
> Content-Type: text/plain; charset=us-ascii
> 
> Can you try this with the current trunk (r23587 or later)?
> 
> I just added a number of new features and bug fixes, and I would be
> interested to see if it fixes the problem. In particular I suspect that
> this might be related to the Init/Finalize bounding of the checkpoint
> region.
> 
> -- Josh
> 
> On Aug 10, 2010, at 2:18 PM, <ananda.mu...@wipro.com>
> <ananda.mu...@wipro.com> wrote:
> 
>> Josh
>> 
>> Please find attached is the python program that reproduces the hang
> that
>> I described. Initial part of this file describes the prerequisite
>> modules and the steps to reproduce the problem. Please let me know if
>> you have any questions in reproducing the hang.
>> 
>> Please note that, if I add the following lines at the end of the
> program
>> (in case sleep_time is True), the problem disappears ie; program
> resumes
>> successfully after successful completion of checkpoint.
>> # Add following lines at the end for sleep_time is True
>> else:
>>  time.sleep(0.1)
>> # End of added lines
>> 
>> 
>> Thanks a lot for your time in looking into this issue.
>> 
>> Regards
>> Ananda
>> 
>> Ananda B Mudar, PMP
>> Senior Technical Architect
>> Wipro Technologies
>> Ph: 972 765 8093
>> ananda.mu...@wipro.com
>> 
>> 
>> -Original Message-
>> Date: Mon, 9 Aug 2010 16:37:58 -0400
>> From: Joshua Hursey <jjhur...@open-mpi.org>
>> Subject: Re: [OMPI users] Checkpointing mpi4py program
>> To: Open MPI Users <us...@open-mpi.org>
>> Message-ID: <270bd450-743a-4662-9568-1fedfcc6f...@open-mpi.org>
>> Content-Type: text/plain; charset=windows-1252
>> 
>> I have not tried to checkpoint an mpi4py application, so I cannot say
>> for sure if it works or not. You might be hitting something with the
>> Python runtime interacting in an odd way with either Open MPI or BLCR.
>> 
>> Can you attach a debugger and get a backtrace on a stuck checkpoint?
>> That might show us where things are held up.
>> 
>> -- Josh
>> 
>> 
>> On Aug 9, 2010, at 4:04 PM, <ananda.mu...@wipro.com>
>> <ananda.mu...@wipro.com> wrote:
>> 
>>> Hi
>>> 
>>> I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR
>> 0.8.2. When I run ompi-checkpoint on the program written using mpi4py,
> I
>> see that program doesn?t resume sometimes after successful checkpoint
>> creation. This doesn?t occur always meaning the program resumes after
>> successful checkpoint creation most of the time and completes
>> successfully. Has anyone tested the checkpoint/restart functionality
>> with mpi4py programs? Are there any best practices that I should keep
> in
>> mind while checkpointing mpi4py programs?
>>> 
>>> Thanks for your time
>>> -  Ananda
>>> Please do not print this email unless it is absolutely necessary.
>>> 
>>> The information contained in this electronic message and any
>> attachments to this message are intended for the exclusive use of the
>> addressee(s) and may contain proprietary, confidential or privileged
>> information. If you are not the intended recipient, you should not
>> disseminate, distribute or copy this e-mail. Please notify the sender
>> immediately and destroy all copies of this message and any
> attachments.
>>> 
>>> WARNING: Computer viruses can be transmitted via email. The recipient
>> sh

Re: [OMPI users] Checkpointing mpi4py program

2010-08-12 Thread Joshua Hursey
Can you try this with the current trunk (r23587 or later)?

I just added a number of new features and bug fixes, and I would be interested 
to see if it fixes the problem. In particular I suspect that this might be 
related to the Init/Finalize bounding of the checkpoint region.

-- Josh

On Aug 10, 2010, at 2:18 PM, <ananda.mu...@wipro.com> <ananda.mu...@wipro.com> 
wrote:

> Josh
> 
> Please find attached is the python program that reproduces the hang that
> I described. Initial part of this file describes the prerequisite
> modules and the steps to reproduce the problem. Please let me know if
> you have any questions in reproducing the hang.
> 
> Please note that, if I add the following lines at the end of the program
> (in case sleep_time is True), the problem disappears ie; program resumes
> successfully after successful completion of checkpoint.
> # Add following lines at the end for sleep_time is True
> else:
>   time.sleep(0.1)
> # End of added lines
> 
> 
> Thanks a lot for your time in looking into this issue.
> 
> Regards
> Ananda
> 
> Ananda B Mudar, PMP
> Senior Technical Architect
> Wipro Technologies
> Ph: 972 765 8093
> ananda.mu...@wipro.com
> 
> 
> -Original Message-
> Date: Mon, 9 Aug 2010 16:37:58 -0400
> From: Joshua Hursey <jjhur...@open-mpi.org>
> Subject: Re: [OMPI users] Checkpointing mpi4py program
> To: Open MPI Users <us...@open-mpi.org>
> Message-ID: <270bd450-743a-4662-9568-1fedfcc6f...@open-mpi.org>
> Content-Type: text/plain; charset=windows-1252
> 
> I have not tried to checkpoint an mpi4py application, so I cannot say
> for sure if it works or not. You might be hitting something with the
> Python runtime interacting in an odd way with either Open MPI or BLCR.
> 
> Can you attach a debugger and get a backtrace on a stuck checkpoint?
> That might show us where things are held up.
> 
> -- Josh
> 
> 
> On Aug 9, 2010, at 4:04 PM, <ananda.mu...@wipro.com>
> <ananda.mu...@wipro.com> wrote:
> 
>> Hi
>> 
>> I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR
> 0.8.2. When I run ompi-checkpoint on the program written using mpi4py, I
> see that program doesn?t resume sometimes after successful checkpoint
> creation. This doesn?t occur always meaning the program resumes after
> successful checkpoint creation most of the time and completes
> successfully. Has anyone tested the checkpoint/restart functionality
> with mpi4py programs? Are there any best practices that I should keep in
> mind while checkpointing mpi4py programs?
>> 
>> Thanks for your time
>> -  Ananda
>> Please do not print this email unless it is absolutely necessary.
>> 
>> The information contained in this electronic message and any
> attachments to this message are intended for the exclusive use of the
> addressee(s) and may contain proprietary, confidential or privileged
> information. If you are not the intended recipient, you should not
> disseminate, distribute or copy this e-mail. Please notify the sender
> immediately and destroy all copies of this message and any attachments.
>> 
>> WARNING: Computer viruses can be transmitted via email. The recipient
> should check this email and any attachments for the presence of viruses.
> The company accepts no liability for any damage caused by any virus
> transmitted by this email.
>> 
>> www.wipro.com
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> 
> --
> 
> Message: 8
> Date: Mon, 9 Aug 2010 13:50:03 -0700
> From: John Hsu <john...@willowgarage.com>
> Subject: Re: [OMPI users] deadlock in openmpi 1.5rc5
> To: Open MPI Users <us...@open-mpi.org>
> Message-ID:
>   

Re: [OMPI users] Checkpointing mpi4py program

2010-08-09 Thread Joshua Hursey
I have not tried to checkpoint an mpi4py application, so I cannot say for sure 
if it works or not. You might be hitting something with the Python runtime 
interacting in an odd way with either Open MPI or BLCR.

Can you attach a debugger and get a backtrace on a stuck checkpoint? That might 
show us where things are held up.

-- Josh


On Aug 9, 2010, at 4:04 PM,   
wrote:

> Hi
>  
> I have integrated mpi4py with openmpi 1.4.2 that was built with BLCR 0.8.2. 
> When I run ompi-checkpoint on the program written using mpi4py, I see that 
> program doesn’t resume sometimes after successful checkpoint creation. This 
> doesn’t occur always meaning the program resumes after successful checkpoint 
> creation most of the time and completes successfully. Has anyone tested the 
> checkpoint/restart functionality with mpi4py programs? Are there any best 
> practices that I should keep in mind while checkpointing mpi4py programs?
>  
> Thanks for your time
> -  Ananda
> Please do not print this email unless it is absolutely necessary.
> 
> The information contained in this electronic message and any attachments to 
> this message are intended for the exclusive use of the addressee(s) and may 
> contain proprietary, confidential or privileged information. If you are not 
> the intended recipient, you should not disseminate, distribute or copy this 
> e-mail. Please notify the sender immediately and destroy all copies of this 
> message and any attachments.
> 
> WARNING: Computer viruses can be transmitted via email. The recipient should 
> check this email and any attachments for the presence of viruses. The company 
> accepts no liability for any damage caused by any virus transmitted by this 
> email.
> 
> www.wipro.com
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Segmentation fault (11)

2010-03-31 Thread Joshua Hursey
That is interesting. I cannot think of any reason why this might be causing a 
problem just in Open MPI. popen() is similar to fork()/system() so you have to 
be careful with interconnects that do not play nice with fork(), like openib. 
But since it looks like you are excluding openib, this should not be the 
problem.

I wonder if this has something to so with the way we use BLCR (maybe we need to 
pass additional parameters to cr_checkpoint()). When the process fails, are 
there any messages in the system logs from BLCR indicating an issue that it 
encountered? It is common for BLCR to post a 'socket open' warning, but that is 
expected/normal since we leave TCP sockets open in most cases as an 
optimization. I am wondering if there is a warning about the popen'ed process.

Personally, I will not have an opportunity to look into this in more detail 
until probably mid-April. :/

Let me know what you find, and maybe we can sort out what is happening on the 
list.

-- Josh

On Mar 29, 2010, at 2:28 PM, Jean Potsam wrote:

> Hi Josh/All,
>I just tested a simple c application with blcr and it worked 
> fine.
>  
> ##
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include
> #include  
> #include 
> 
> char * getprocessid() 
> {
> FILE * read_fp;
> char buffer[BUFSIZ + 1];
> int chars_read;
> char * buffer_data="12345";
> memset(buffer, '\0', sizeof(buffer));
>   read_fp = popen("uname -a", "r");
>  /*
>   ...
>  */ 
>  return buffer_data;
> }
>  
> int main(int argc, char ** argv)
> {
> 
>  int rank;
>int size;
> char * thedata;
> int n=0;
>  thedata=getprocessid();
>  printf(" the data is %s", thedata);
>  
>   while( n <10)
>   {
> printf("value is %d\n", n);
> n++;
> sleep(1);
>}
>  printf("bye\n");
>  
> }
>  
>  
> jean@sun32:/tmp$ cr_run ./pipetest3 &
> [1] 31807
> jean@sun32:~$  the data is 12345value is 0
> value is 1
> value is 2
> ...
> value is 9
> bye
>  
> jean@sun32:/tmp$ cr_checkpoint 31807
>  
> jean@sun32:/tmp$ cr_restart context.31807
> value is 7
> value is 8
> value is 9
> bye
>  
> ##
>  
>  
> It looks like its more to do with Openmpi.  Any ideas from you side?
>  
> Thank you.
>  
> Kind regards,
>  
> Jean.
>  
>  
>  
> 
> 
> --- On Mon, 29/3/10, Josh Hursey  wrote:
> 
> From: Josh Hursey 
> Subject: Re: [OMPI users] Segmentation fault (11)
> To: "Open MPI Users" 
> Date: Monday, 29 March, 2010, 16:08
> 
> I wonder if this is a bug with BLCR (since the segv stack is in the BLCR 
> thread). Can you try an non-MPI version of this application that uses 
> popen(), and see if BLCR properly checkpoints/restarts it?
> 
> If so, we can start to see what Open MPI might be doing to confuse things, 
> but I suspect that this might be a bug with BLCR. Either way let us know what 
> you find out.
> 
> Cheers,
> Josh
> 
> On Mar 27, 2010, at 6:17 AM, jody wrote:
> 
> > I'm not sure if this is the cause of your problems:
> > You define the constant BUFFER_SIZE, but in the code you use a constant 
> > called BUFSIZ...
> > Jody
> > 
> > 
> > On Fri, Mar 26, 2010 at 10:29 PM, Jean Potsam  
> > wrote:
> > Dear All,
> >   I am having a problem with openmpi . I have installed openmpi 
> > 1.4 and blcr 0.8.1
> > 
> > I have written a small mpi application as follows below:
> > 
> > ###
> > #include 
> > #include 
> > #include 
> > #include 
> > #include 
> > #include 
> > #include 
> > #include 
> > #include 
> > #include
> > #include 
> > #include 
> > 
> > #define BUFFER_SIZE PIPE_BUF
> > 
> > char * getprocessid()
> > {
> > FILE * read_fp;
> > char buffer[BUFSIZ + 1];
> > int chars_read;
> > char * buffer_data="12345";
> > memset(buffer, '\0', sizeof(buffer));
> >   read_fp = popen("uname -a", "r");
> >  /*
> >   ...
> >  */
> >  return buffer_data;
> > }
> > 
> > int main(int argc, char ** argv)
> > {
> >   MPI_Status status;
> >  int rank;
> >int size;
> > char * thedata;
> > MPI_Init(, );
> > MPI_Comm_size(MPI_COMM_WORLD,);
> > MPI_Comm_rank(MPI_COMM_WORLD,);
> >  thedata=getprocessid();
> >  printf(" the data is %s", thedata);
> > MPI_Finalize();
> > }
> > 
> > 
> > I get the following result:
> > 
> > ###
> > jean@sunn32:~$ mpicc pipetest2.c -o pipetest2
> > jean@sunn32:~$ mpirun -np 1 -am ft-enable-cr -mca btl ^openib  pipetest2
> > [sun32:19211] *** Process received signal ***
> > [sun32:19211] Signal: Segmentation fault (11)
> > [sun32:19211] Signal code: Address not mapped (1)
> > [sun32:19211] Failing at address: 0x4
> > [sun32:19211] [ 0] [0xb7f3c40c]
> > [sun32:19211] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb796868b]
> > [sun32:19211] [ 2] 

Re: [OMPI users] low efficiency when we use --am ft-enable-cr to checkpoint

2010-03-05 Thread Joshua Hursey

On Mar 5, 2010, at 3:15 AM, 马少杰 wrote:

> Dear Sir:
> - What version of Open MPI are you using?
> my version is 1.3.4
>  - What configure options are you using?
> ./configure --with-ft=cr --enable-mpi-threads --enable-ft-thread 
> --with-blcr=$dir --with-blcr-libdir=/$dir/lib 
> --prefix=/public/mpi/openmpi134-gnu-cr --enable-mpirun-prefix-by-default
> make
> make install
>  - What MCA parameters are you using?
> mpirun -np 8 --am ft-enable-cr  -machinefile ma  xhpl
> vim $HOME/.openmpi/mca-params.conf
> # Local snapshot directory (not used in this scenario)
> crs_base_snapshot_dir=/home/me/tmp
> # Remote snapshot directory (globally mounted file system))
> snapc_base_global_snapshot_dir=/home/me/checkpoints
>  
>  
>  - Are you building from a release tarball or a SVN checkout?
> building from openmpi-1.3.4.tar.gz
>  
>  
> Now, I solve the problem successfully.
> I found that the mpirun command as
>  
> mpirun -np 8 --am ft-enable-cr  --mca opal_cr_use_thread 0  -machinefile ma  
> ./xhpl
>  
> the time cost is almost equal to the time cost by the command: mpirun -np 8  
> -machinefile ma  ./xhpl
>  
> I think it should be  a bug.

Since you have configured Open MPI to use the C/R thread (--enable-ft-thread) 
then Open MPI will start the concurrent C/R thread when you ask for C/R to be 
enabled. By default the thread polls very aggressively (waiting only 0 
microseconds, or the same as calling sched_yeild() on most systems). By turning 
it off you eliminate the contention the thread is causing on the system. There 
are two MCA parameters that control this behavior, links below:
  http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_thread_sleep_check
  http://osl.iu.edu/research/ft/ompi-cr/api.php#mca-opal_cr_thread_sleep_wait

I agree that the default behavior is probably too aggressive for most 
applications. However by increasing these values the user is also increasing 
the amount of time before a checkpoint can begin. In my setup I usually set:
  opal_cr_thread_sleep_wait=1000
Which will throttle down the thread when the application is in the MPI library.

You might want to play around with these MCA parameters to tune the 
aggressiveness of the C/R thread to your performance needs. In the mean time I 
will look into finding better default parameters for these options.

Cheers,
Josh


>  
>  
> 2010-03-05
> 马少杰
> 发件人: Joshua Hursey
> 发送时间: 2010-03-05  00:07:19
> 收件人: Open MPI Users
> 抄送:
> 主题: Re: [OMPI users] low efficiency when we use --am ft-enable-cr tocheckpoint
> There is some overhead involved when activating the current C/R functionality 
> in Open MPI due to the wrapping of the internal point-to-point stack. The 
> wrapper (CRCP framework) tracks the signature of each message (not the 
> buffer, so constant time for any size MPI message) so that when we need to 
> quiesce the network we know of all the outstanding messages that need to be 
> drained.
>  
> So there is an overhead, but it should not be as significant as you have 
> mentioned. I looked at some of the performance aspects in the paper at the 
> link below:
>   http://www.open-mpi.org/papers/hpdc-2009/
> Though I did not look at HPL explicitly in this paper (just NPB, GROMACS, and 
> NetPipe), I have in testing and the time difference was definitely not 2x 
> (cannot recall the exact differences at the moment).
>  
> Can you tell me a bit about your setup:
>  - What version of Open MPI are you using?
>  - What configure options are you using?
>  - What MCA parameters are you using?
>  - Are you building from a release tarball or a SVN checkout?
>  
> -- Josh
>  
>  
> On Mar 3, 2010, at 10:07 PM, 马少杰 wrote:
>  
> >  
> >  
> > 2010-03-04
> > 马少杰
> > Dear Sir:
> >I want to use blcr  and openmpi to checkpoint, now I can save check 
> > point and restart my work successfully. How erver I find the option "--am 
> > ft-enable-cr" will case large cost . For example ,  when I run my HPL job  
> > without and with the option "--am ft-enable-cr" on 4 hosts (32 process, IB 
> > network) respectively , the time costed are   8m21.180sand 16m37.732s 
> > respctively. it is should be noted that I did not save the checkpoint when 
> > I run the job, the additional cost is caused by "--am ft-enable-cr" 
> > independently. Why can the optin "--am ft-enable-cr"  case so much system  
> > cost? Is it normal? How can I solve the problem.
> >   I also test  other mpi applications, the problem still exists.   
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
>  
>  
>  
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] low efficiency when we use --am ft-enable-cr to checkpoint

2010-03-04 Thread Joshua Hursey
There is some overhead involved when activating the current C/R functionality 
in Open MPI due to the wrapping of the internal point-to-point stack. The 
wrapper (CRCP framework) tracks the signature of each message (not the buffer, 
so constant time for any size MPI message) so that when we need to quiesce the 
network we know of all the outstanding messages that need to be drained.

So there is an overhead, but it should not be as significant as you have 
mentioned. I looked at some of the performance aspects in the paper at the link 
below:
  http://www.open-mpi.org/papers/hpdc-2009/
Though I did not look at HPL explicitly in this paper (just NPB, GROMACS, and 
NetPipe), I have in testing and the time difference was definitely not 2x 
(cannot recall the exact differences at the moment).

Can you tell me a bit about your setup:
 - What version of Open MPI are you using?
 - What configure options are you using?
 - What MCA parameters are you using?
 - Are you building from a release tarball or a SVN checkout?

-- Josh


On Mar 3, 2010, at 10:07 PM, 马少杰 wrote:

>  
>  
> 2010-03-04
> 马少杰
> Dear Sir:
>I want to use blcr  and openmpi to checkpoint, now I can save check 
> point and restart my work successfully. How erver I find the option "--am 
> ft-enable-cr" will case large cost . For example ,  when I run my HPL job  
> without and with the option "--am ft-enable-cr" on 4 hosts (32 process, IB 
> network) respectively , the time costed are   8m21.180sand 16m37.732s 
> respctively. it is should be noted that I did not save the checkpoint when I 
> run the job, the additional cost is caused by "--am ft-enable-cr" 
> independently. Why can the optin "--am ft-enable-cr"  case so much system  
> cost? Is it normal? How can I solve the problem.
>   I also test  other mpi applications, the problem still exists.   
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] checkpointing multi node and multi process applications

2010-03-04 Thread Joshua Hursey

On Mar 4, 2010, at 8:17 AM, Fernando Lemos wrote:

> On Wed, Mar 3, 2010 at 10:24 PM, Fernando Lemos  wrote:
> 
>> Is there anything I can do to provide more information about this bug?
>> E.g. try to compile the code in the SVN trunk? I also have kept the
>> snapshots intact, I can tar them up and upload them somewhere in case
>> you guys need it. I can also provide the source code to the ring
>> program, but it's really the canonical ring MPI example.
>> 
> 
> I tried 1.5 (1.5a1r22754 nightly snapshot, same compilation flags).
> This time taking the checkpoint didn't generate any error message:
> 
> root@debian1:~# mpirun -am ft-enable-cr -mca btl_tcp_if_include eth1
> -np 2 --host debian1,debian2 ring
> 
 Process 1 sending 2761 to 0
 Process 1 received 2760
 Process 1 sending 2760 to 0
> root@debian1:~#
> 
> But restoring it did:
> 
> root@debian1:~# ompi-restart ompi_global_snapshot_23071.ckpt
> [debian1:23129] Error: Unable to access the path
> [/root/ompi_global_snapshot_23071.ckpt/0/opal_snapshot_1.ckpt]!
> --
> Error: The filename (opal_snapshot_1.ckpt) is invalid because either
> you have not provided a filename
>   or provided an invalid filename.
>   Please see --help for usage.
> 
> --
> --
> mpirun has exited due to process rank 1 with PID 23129 on
> node debian1 exiting improperly. There are two reasons this could occur:
> 
> 1. this process did not call "init" before exiting, but others in
> the job did. This can cause a job to hang indefinitely while it waits
> for all processes to call "init". By rule, if one process calls "init",
> then ALL processes must call "init" prior to termination.
> 
> 2. this process called "init", but exited without calling "finalize".
> By rule, all processes that call "init" MUST call "finalize" prior to
> exiting or it will be considered an "abnormal termination"
> 
> This may have caused other processes in the application to be
> terminated by signals sent by mpirun (as reported here).
> --
> root@debian1:~#
> 
> Indeed, opal_snapshot_1.ckpt does not exist exist:
> 
> root@debian1:~# find ompi_global_snapshot_23071.ckpt/
> ompi_global_snapshot_23071.ckpt/
> ompi_global_snapshot_23071.ckpt/global_snapshot_meta.data
> ompi_global_snapshot_23071.ckpt/restart-appfile
> ompi_global_snapshot_23071.ckpt/0
> ompi_global_snapshot_23071.ckpt/0/opal_snapshot_0.ckpt
> ompi_global_snapshot_23071.ckpt/0/opal_snapshot_0.ckpt/ompi_blcr_context.23073
> ompi_global_snapshot_23071.ckpt/0/opal_snapshot_0.ckpt/snapshot_meta.data
> root@debian1:~#
> 
> It can be found in debian2:
> 
> root@debian2:~# find ompi_global_snapshot_23071.ckpt/
> ompi_global_snapshot_23071.ckpt/
> ompi_global_snapshot_23071.ckpt/0
> ompi_global_snapshot_23071.ckpt/0/opal_snapshot_1.ckpt
> ompi_global_snapshot_23071.ckpt/0/opal_snapshot_1.ckpt/snapshot_meta.data
> ompi_global_snapshot_23071.ckpt/0/opal_snapshot_1.ckpt/ompi_blcr_context.6501
> root@debian2:~#

By default, Open MPI requires a shared file system to save checkpoint files. So 
by default the local snapshot is moved, since the system assumes that it is 
writing to the same directory on a shared file system. If you want to use the 
local disk staging functionality (which is known to be broken in the 1.4 
series), check out the example on the webpage below:
  http://osl.iu.edu/research/ft/ompi-cr/examples.php#uc-ckpt-local

> 
> Then I tried supplying a hostfile for ompi-run and it worked just
> fine! I thought the checkpoint included the hosts information?

We intentionally do not save the hostfile as part of the checkpoint. Typically 
folks will want to restart on different nodes than those they checkpointed on 
(such as in a batch scheduling environment). If we saved the hostfile then it 
could lead to unexpected user behavior on restart if the machines that they 
wish to restart on change.

If you need to pass a hostfile, the you can pass one to ompi-restart just as 
you would mpirun.

> 
> So I think it's fixed in 1.5. Should I try the 1.4 branch in SVN?

The file staging functionality is known to be broken in the 1.4 series at this 
time, per the ticket below:
  https://svn.open-mpi.org/trac/ompi/ticket/2139

Unfortunately the fix is likely to be both custom for the branch (since we 
redesigned the functionality for the trunk and v1.5) and fairly involved. I 
don't have the time at the moment to work on fix, but hopefully in the coming 
months I will be able to look into this issue. In the mean time, patches are 
always welcome :)

Hope that helps,
Josh


> 
> 
> Thanks a bunch,
> ___
> users mailing list
> us...@open-mpi.org
> 

Re: [OMPI users] Segfault in ompi-restart (ft-enable-cr)

2010-03-03 Thread Joshua Hursey

On Mar 3, 2010, at 3:42 PM, Fernando Lemos wrote:

> On Wed, Mar 3, 2010 at 5:31 PM, Joshua Hursey <jjhur...@open-mpi.org> wrote:
> 
>> 
>> Yes, ompi-restart should be printing a helpful message and exiting normally. 
>> Thanks for the bug report. I believe that I have seen and fixed this on a 
>> development branch making its way to the trunk. I'll make sure to move the 
>> fix to the 1.4 series once it has been applied to the trunk.
>> 
>> I filed a ticket on this if you wanted to track the issue.
>>  https://svn.open-mpi.org/trac/ompi/ticket/2329
> 
> Ah, that's great. Just wondering, do you have any idea why blcr-util
> is required? That package only contains the cr_* binaries (cr_restart,
> cr_checkpoint, cr_run) and some docs (manpages, changelog, etc.). I've
> filled a Debian bug (#572229) about making openmpi-checkpoint depend
> on blcr-util, but the package maintainer told me he found it unusual
> that ompi-restart would depend on the cr_* binaries since libcr
> supposedly provides all the functionality ompi-restart needs.
> 
> I'm about to compile OpenMPI in debug mode and take a look at the
> backtrace to see if I can understand what's going on.
> 
> Btw, this is the list of files in the blcr-util package:
> http://packages.debian.org/sid/amd64/blcr-util/filelist . As you can
> see, only cr_* binaries and docs.

Open MPI currently calls 'cr_restart' for each process it restarts, exec'ed 
from the 'opal-restart' binary (LAM/MPI also used cr_restart directly, in case 
anyone is interested). We use the internal library interface for checkpoint, 
but not restarting at this time.

If I recall correctly, it wasn't until relatively recently that BLCR added the 
ability to restart a process from a library call. We have not put in the code 
to use this functionality (though all of the framework interfaces are in place 
to do so). On my development branch I will add the ability to use the BLCR 
library interface if available. That functionality will not likely make it to 
the v1.4 release series since it is not really a bug fix, but I will plan on 
including it in the v1.5 and later releases. And just so I don't lose track of 
it, I created an enhancement ticket for this:
  https://svn.open-mpi.org/trac/ompi/ticket/2330

Cheers,
Josh

> 
>> 
>> Thanks again,
>> Josh
> 
> Thank you!
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Segfault in ompi-restart (ft-enable-cr)

2010-03-03 Thread Joshua Hursey

On Mar 2, 2010, at 9:17 AM, Fernando Lemos wrote:

> On Sun, Feb 28, 2010 at 11:11 PM, Fernando Lemos  
> wrote:
>> Hello,
>> 
>> 
>> I'm trying to come up with a fault tolerant OpenMPI setup for research
>> purposes. I'm doing some tests now, but I'm stuck with a segfault when
>> I try to restart my test program from a checkpoint.
>> 
>> My test program is the "ring" program, where messages are sent to the
>> next node in the ring N times. It's pretty simple, I can supply the
>> source code if needed. I'm running it like this:
>> 
>> # mpirun -np 4 -am ft-enable-cr ring
>> ...
> Process 1 sending 703 to 2
> Process 3 received 704
> Process 3 sending 704 to 0
> Process 3 received 703
> Process 3 sending 703 to 0
>> --
>> mpirun noticed that process rank 0 with PID 18358 on node debian1
>> exited on signal 0 (Unknown signal 0).
>> --
>> 4 total processes killed (some possibly by mpirun during cleanup)
>> 
>> That's the output when I ompi-checkpoint the mpirun PID from another 
>> terminal.
>> 
>> The checkpoint is taken just fine in maybe 1.5 seconds. I can see the
>> checkpoint directory has been created in $HOME.
>> 
>> This is what I get when I try to run ompi-restart
>> 
>> ps axroot@debian1:~# ps ax | grep mpirun
>> 18357 pts/0R+ 0:01 mpirun -np 4 -am ft-enable-cr ring
>> 18378 pts/5S+ 0:00 grep mpirun
>> root@debian1:~# ompi-checkpoint 18357
>> Snapshot Ref.:   0 ompi_global_snapshot_18357.ckpt
>> root@debian1:~# ompi-checkpoint --term 18357
>> Snapshot Ref.:   1 ompi_global_snapshot_18357.ckpt
>> root@debian1:~# ompi-restart ompi_global_snapshot_18357.ckpt
>> --
>> Error: Unable to obtain the proper restart command to restart from the
>>   checkpoint file (opal_snapshot_2.ckpt). Returned -1.
>> 
>> --
>> [debian1:18384] *** Process received signal ***
>> [debian1:18384] Signal: Segmentation fault (11)
>> [debian1:18384] Signal code: Address not mapped (1)
>> [debian1:18384] Failing at address: 0x725f725f
>> [debian1:18384] [ 0] [0xb775f40c]
>> [debian1:18384] [ 1]
>> /usr/local/lib/libopen-pal.so.0(opal_argv_free+0x33) [0xb771ea63]
>> [debian1:18384] [ 2]
>> /usr/local/lib/libopen-pal.so.0(opal_event_fini+0x30) [0xb77150a0]
>> [debian1:18384] [ 3]
>> /usr/local/lib/libopen-pal.so.0(opal_finalize+0x35) [0xb7708fa5]
>> [debian1:18384] [ 4] opal-restart [0x804908e]
>> [debian1:18384] [ 5] /lib/i686/cmov/libc.so.6(__libc_start_main+0xe5)
>> [0xb7568b55]
>> [debian1:18384] [ 6] opal-restart [0x8048fc1]
>> [debian1:18384] *** End of error message ***
>> --
>> mpirun noticed that process rank 2 with PID 18384 on node debian1
>> exited on signal 11 (Segmentat
>> --
>> 
>> I used a clean install of Debian Squeeze (testing) to make sure my
>> environment was ok. Those are the steps I took:
>> 
>> - Installed Debian Squeeze, only base packages
>> - Installed build-essential, libcr0, libcr-dev, blcr-dkms (build
>> tools, BLCR dev and run-time environment)
>> - Compiled openmpi-1.4.1
>> 
>> Note that I did compile openmpi-1.4.1 because the Debian package
>> (openmpi-checkpoint) doesn't seem to be usable at the moment. There
>> are no leftovers from any previous install of Debian packages
>> supplying OpenMPI because this is a fresh install, no openmpi package
>> had been installed before.
>> 
>> I used the following configure options:
>> 
>> # ./configure --with-ft=cr --enable-ft-thread --enable-mpi-threads
>> 
>> I also tried to add the option --with-memory-manager=none because I
>> saw an e-mail on the mailing list that described this as a possible
>> solution to an (apparently) not related problem, but the problem
>> remains the same.
>> 
>> I don't have config.log (I rm'ed the build dir), but if you think it's
>> necessary I can recompile OpenMPI and provide it.
>> 
>> Some information about the system (VirtualBox virtual machine, single
>> processor, btw):
>> 
>> Kernel version 2.6.32-trunk-686
>> 
>> root@debian1:~# lsmod | grep blcr
>> blcr   79084  0
>> blcr_imports2077  1 blcr
>> 
>> libcr (BLCR) is version 0.8.2-9.
>> 
>> gcc is version 4.4.3.
>> 
>> 
>> Please let me know of any other information you might need.
>> 
>> 
>> Thanks in advance,
>> 
> 
> Hello,
> 
> I figured it out. The problem is that the Debian package brcl-utils,
> which contains the BLCR binaries (cr_restart, cr_checkpoint, etc.)
> wasn't installed. I believe OpenMPI could perhaps show a more
> descriptive message instead of segfaulting, though? Also, you might
> want to add that information to the FAQ.
> 
> Anyways, 

Re: [OMPI users] OpenMPI checkpoint/restart on multiple nodes

2010-02-08 Thread Joshua Hursey
You can use the 'checkpoint to local disk' example to checkpoint and restart 
without access to a globally shared storage devices. There is an example on the 
website that does not use a globally mounted file system:
  http://www.osl.iu.edu/research/ft/ompi-cr/examples.php#uc-ckpt-local

What version of Open MPI are you using? This functionality is known to be 
broken on the v1.3/1.4 branches, per the ticket below:
  https://svn.open-mpi.org/trac/ompi/ticket/2139

Try the nightly snapshot of the 1.5 branch or the development trunk, and see if 
this issues still occurs.

-- Josh

On Feb 8, 2010, at 8:35 AM, Andreea Costea wrote:

> I asked this question because checkpointing with to NFS is successful, but 
> checkpointing without a mount filesystem or a shared storage throws this 
> warning:
> 
> WARNING: Could not preload specified file: File already exists. 
> Fileset: /home/andreea/checkpoints/global/ompi_global_snapshot_7426.ckpt/0 
> Host: X 
> 
> Will continue attempting to launch the process. 
> 
> 
> filem:rsh: wait_all(): Wait failed (-1) 
> [[62871,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054 
> 
> even if I set the mca-parameters like this:
> snapc_base_store_in_place=0
> 
> crs_base_snapshot_dir
> =/home/andreea/checkpoints/local
> 
> snapc_base_global_snapshot_dir
> =/home/andreea/checkpoints/global
> and the nodes can connect through ssh without a password. 
> 
> Thanks,
> Andreea
> 
> On Mon, Feb 8, 2010 at 12:59 PM, Andreea Costea  
> wrote:
> Hi,
> 
> Let's say I have an MPI application running on several hosts. Is there any 
> way to checkpoint this application without having a shared storage between 
> the nodes?
> I already took a look at the examples here 
> http://www.osl.iu.edu/research/ft/ompi-cr/examples.php, but it seems that in 
> both cases there is a globally mounted file system. 
> 
> Thanks,
> Andreea
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Checkpoint/Restart error

2010-01-14 Thread Joshua Hursey
On Jan 14, 2010, at 8:20 AM, Andreea Costea wrote:

> Hi,
> 
> I wanted to try the C/R feature in OpenMPI version 1.4.1 that I have 
> downloaded today. When I want to checkpoint I am having the following error 
> message:
> [[65192,0],0] ORTE_ERROR_LOG: Not found in file orte-checkpoint.c at line 399
> HNP with PID 2337 Not found! 

This looks like an error coming from the 1.3.3 install. In 1.4.1 there is no 
error at line 399, in 1.3.3 there is. Check your installation of Open MPI, I 
bet you are mixing 1.4.1 and 1.3.3, which can cause unexpected problems.

Try a clean installation of 1.4.1 and double check that 1.3.3 is not in your 
path/lib_path any longer.

-- Josh

> 
> I tried the same thing with version 1.3.3 and it works perfectly.
> 
> Any idea why?
> 
> thanks,
> Andreea
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] OpenMPI checkpoint/restart

2010-01-14 Thread Joshua Hursey

On Jan 14, 2010, at 2:50 AM, Andreea Costea wrote:

> Hei there
> 
> I have some questions regarding checkpoint/restart:
> 
> 1. Until recently I thought that ompi-restart and ompi-restart are used to 
> checkpoint a process inside an MPI application. Now I reread this and I 
> realized that actually what it does is to checkpoint the mpirun process. Does 
> this mean that if I run my application with multiple processes and on 
> multiple nodes in my network the checkpoint file will contain the states of 
> all the processes of my MPI application?

I think you slightly misread the entry. ompi-checkpoint checkpoints the entire 
MPI application, across node boundaries. It requires that the user pass the PID 
of mpirun to server as a reference point for the command. This way a user can 
run multiple mpiruns from the same machine and only checkpoint a subset of 
those.

> 2. Can I restart the application on a different node? 

Yes. If you have trouble doing this, then I would suggest following the 
directions in the BLCR FAQ entry below (it usually addressed 99% of the 
problems people have doing this):
  https://upc-bugs.lbl.gov//blcr/doc/html/FAQ.html#prelink

-- Josh

> 
> Thanks a lot,
> Andreea
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] Elementary question on openMPI application location when using PBS submission

2009-12-02 Thread Joshua Hursey
The --preload-* options to 'mpirun' currently use the ssh/scp commands (or 
rsh/rcp via an MCA parameter) to move files from the machine local to the 
'mpirun' command to the compute nodes during launch. This assumes that you have 
Open MPI already installed on all of the machines. It was an option targeted to 
users that do not wish to have an NFS or similar mount on all machines.

Torque/PBS may be faster at this depending on how they organize the staging, 
but I assume that we are essentially doing the same thing. There was a post on 
the users list a little while back discussing these options a bit more fully.

-- Josh

On Dec 1, 2009, at 3:21 PM, Belaid MOA wrote:

> I saw those options before but somehow I did not pay attention to them :(. 
> I was thinking that the copying is done automatically, so I felt the options 
> were useless but I was wrong.
> Thanks a lot Gus; that's exactly what I was looking for. I will try them then.
> 
> Best Regards.
> ~Belaid. 
> 
> > Date: Tue, 1 Dec 2009 15:14:01 -0500
> > From: g...@ldeo.columbia.edu
> > To: us...@open-mpi.org
> > Subject: Re: [OMPI users] Elementary question on openMPI application 
> > location when using PBS submission
> > 
> > Hi Belaid Moa
> > 
> > I spoke too fast, and burnt my tongue.
> > I should have double checked before speaking out.
> > I just looked up "man mpiexec" and found the options below.
> > I never used or knew about them, but you may want to try.
> > They seem to be similar to the Torque/PBS stage_in feature.
> > I would guess they use scp to copy the executable and other
> > files to the nodes, but I don't really know which copying
> > mechanism is used.
> > 
> > Gus Correa
> > -
> > Gustavo Correa
> > Lamont-Doherty Earth Observatory - Columbia University
> > Palisades, NY, 10964-8000 - USA
> > -
> > 
> > #
> > Excerpt from (OpenMPI 1.3.2) "man mpiexec":
> > #
> > 
> > --preload-binary
> > Copy the specified executable(s) to remote machines 
> > prior to
> > starting remote processes. The executables will be 
> > copied to
> > the Open MPI session directory and will be deleted 
> > upon com-
> > pletion of the job.
> > 
> > --preload-files 
> > Preload the comma separated list of files to the 
> > current
> > working directory of the remote machines where 
> > processes will
> > be launched prior to starting those processes.
> > 
> > --preload-files-dest-dir 
> > The destination directory to be used for 
> > preload-files, if
> > other than the current working directory. By 
> > default, the
> > absolute and relative paths provided by 
> > --preload-files are
> > used.
> > 
> > 
> > 
> > 
> > Gus Correa wrote:
> > > Hi Belaid Moa
> > > 
> > > Belaid MOA wrote:
> > >> Thank you very very much Gus. Does this mean that OpenMPI does not 
> > >> copy the executable from the master node to the worker nodes?
> > > 
> > > Not that I know.
> > > Making the executable available on the nodes, and any
> > > input files the program may need, is the user's responsibility,
> > > not of mpiexec.
> > > 
> > > On the other hand,
> > > Torque/PBS has a "stage_in/stage_out" feature that is supposed to
> > > copy files over to the nodes, if you want to give it a shot.
> > > See "man qsub" and look into the (numerous) "-W" option under
> > > the "stage[in,out]=file_list" sub-options.
> > > This is a relic from the old days where everything had to be on
> > > local disks on the nodes, and NFS ran over Ethernet 10/100,
> > > but it is still used by people that
> > > run MPI programs with heavy I/O, to avoid pounding on NFS or
> > > even on parallel file systems.
> > > I tried the stage_in/out feature a lng time ago,
> > > (old PBS before Torque), but it had issues.
> > > It probably works now with the newer/better
> > > versions of Torque.
> > > 
> > > However, the easy way to get this right is just to use an NFS mounted
> > > directory.
> > > 
> > >> If that's case, I will go ahead and NFS mount my working directory.
> > >>
> > > 
> > > This would make your life much easier.
> > > 
> > > My $0.02.
> > > Gus Correa
> > > -
> > > Gustavo Correa
> > > Lamont-Doherty Earth Observatory - Columbia University
> > > Palisades, NY, 10964-8000 - USA
> > > -
> > > 
> > > 
> > > 
> > > 
> > >> ~Belaid.
> > >>
> > >>
> > >> > Date: Tue, 1 Dec 2009 13:50:57 -0500
> > >> > From: g...@ldeo.columbia.edu
> > >> > To: us...@open-mpi.org
> > >> > Subject: Re: [OMPI users] Elementary question on openMPI 
> > >> application location when using PBS submission
> > >> >
> > >> > Hi Belaid MOA
> > >> >
> > >> > See this FAQ:
> > >> > 
> > >> 

Re: [OMPI users] How to build OMPI with Checkpoint/restart.

2009-09-17 Thread Joshua Hursey


On Sep 16, 2009, at 8:30 AM, Marcin Stolarek wrote:


Hi,

It seems I solved my problem. Root of the error was, that I haven't  
loaded blcr module. So I couldn't checkpoint even one therad  
application.


I am glad to hear that you have things working now.


However I stil can't find MCA:blcr in ompi_all -info, It's working.


This may have been a red-herring, sorry. I think ompi_info will only  
show the 'none' component due to the way it searches for components in  
the system. This is a bug how in the CRS selection logic plays with  
ompi_info. I will take a note/file a bug to look into fixing it.  
Unfortunately I do not have a work around other than looking in the  
install directory for the mca_crs_blcr.so file.


-- Josh



marcin

2009/9/15 Marcin Stolarek 
Hi,

I've done everythink from the beginig.:

rm  -r $ompi_install
make clean
make
make install

In $ompi_install, I've got files you mentioned:
mstol@halo2:/home/guests/mstol/openmpi/lib/openmp# ls mca_crs_bl*
mca_crs_blcr.la  mca_crs_blcr.so

but, when I try:
# ompi_info -all | grep "crs:"
mstol@halo2:/home/guests/mstol/openmpi/openmpi-1.3.3# ompi_info -- 
all | grep "crs:"

MCA crs: none (MCA v2.0, API v2.0, Component v1.3.3)
MCA crs: parameter "crs_base_verbose" (current  
value: "0", data source: default value)
MCA crs: parameter "crs" (current value: "none",  
data source: default value)
MCA crs: parameter  
"crs_none_select_warning" (current value: "0", data source: default  
value)
MCA crs: parameter "crs_none_priority" (current  
value: "0", data source: default value)


I don't have crs: blcr component.

marcin

2009/9/14 Josh Hursey 

The config.log looked fine, so I think you have fixed the configure  
problem that you previously posted about.


Though the config.log indicates that the BLCR component is scheduled  
for compile, ompi_info does not indicate that it is available. I  
suspect that the error below is because the CRS could not find any  
CRS components to select (though there should have been an error  
displayed indicating as such).


I would check your Open MPI installation to make sure that it is the  
one that you configured with. Specifically I would check to make  
sure that in the installation location there are the following files:

$install_dir/lib/openmpi/mca_crs_blcr.so
$install_dir/lib/openmpi/mca_crs_blcr.la

If that checks out, then I would remove the old installation  
directory and try reinstalling fresh.


Let me know how it goes.

-- Josh



On Sep 13, 2009, at 5:49 AM, Marcin Stolarek wrote:

I've tryed another time.  Here is what I get when trying to run  
using-1.4a1r21964 :


(terminus:~) mstol% mpirun --am ft-enable-cr ./a.out
--
It looks like opal_init failed for some reason; your parallel  
process is

likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_cr_init() failed failed
--> Returned value -1 instead of OPAL_SUCCESS
--
[terminus:06120] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file  
runtime/orte_

init.c at line 79
--
It looks like MPI_INIT failed for some reason; your parallel process  
is

likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or  
environment

problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

ompi_mpi_init: orte_init failed
--> Returned "Error" (-1) instead of "Success" (0)
--
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[terminus:6120] Abort before MPI_INIT completed successfully; not  
able to guaran

tee that all other processes were killed!
--
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--

I've included config.log and ompi_info --all output in attacment
LD_LIBRARY_PATH is set correctly.
Any idea?

marcin





2009/9/12 Marcin Stolarek 
Hi,
I'm trying  to compile OpenMPI with  checkpoint restart via BLCR.  
I'm not sure which path shoul I set as a value of --with-blcr option.

I'm using 1.3.3 release, which version of BLCR should I use?

I've compiled the 

Re: [MTT users] Perl Wrap Error

2007-07-06 Thread Joshua Hursey

That seemed to have done the trick.

Thanks,
Josh

On Jul 6, 2007, at 3:04 PM, Ethan Mallove wrote:


On Fri, Jul/06/2007 01:22:06PM, Joshua Hursey wrote:
Anyone seen the following error from MTT before? It looks like it  
is in the

reporter stage.

<->
shell$ /spin/home/jjhursey/testing/mtt//client/mtt --mpi-install   
--scratch

/spin/home/jjhursey/testing/scratch/20070706 --file
/spin/home/jjhursey/testing/etc/jaguar/simple-svn.ini --print-time
--verbose --debug 2>&1 1>>
/spin/home/jjhursey/testing/scratch/20070706/output.txt
This shouldn't happen at /usr/lib/perl5/5.8.3/Text/Wrap.pm line 64.
shell$
<->


"This shouldn't happen at ..." is the die message?

Try this INI [Reporter: TextFile] section:

{{{
  [Reporter: text file backup]
  module = TextFile

  textfile_filename = $phase-$section-$mpi_name-$mpi_version.txt

  # User-defined report headers/footers
  textfile_summary_header = <

The return code is: 6400

I attached the output log incase that helps, and the INI file.

-- Josh


___
mtt-users mailing list
mtt-us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/mtt-users



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/





[MTT users] Perl Wrap Error

2007-07-06 Thread Joshua Hursey
Anyone seen the following error from MTT before? It looks like it is  
in the reporter stage.


<->
shell$ /spin/home/jjhursey/testing/mtt//client/mtt --mpi-install  -- 
scratch /spin/home/jjhursey/testing/scratch/20070706 --file /spin/ 
home/jjhursey/testing/etc/jaguar/simple-svn.ini --print-time -- 
verbose --debug 2>&1 1>> /spin/home/jjhursey/testing/scratch/20070706/ 
output.txt

This shouldn't happen at /usr/lib/perl5/5.8.3/Text/Wrap.pm line 64.
shell$
<->

The return code is: 6400

I attached the output log incase that helps, and the INI file.

-- Josh


jjhursey-mtt.tar.bz2
Description: Binary data



Josh Hursey
jjhur...@open-mpi.org
http://www.open-mpi.org/