On Sep 16, 2009, at 8:30 AM, Marcin Stolarek wrote:

Hi,

It seems I've solved my problem. The root of the error was that I hadn't loaded the blcr module, so I couldn't checkpoint even a single-threaded application.

I am glad to hear that you have things working now.
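
For anyone hitting the same symptom, the missing-module cause described above can be checked before running mpirun. The following is only a sketch: the function name `module_loaded` is hypothetical, and the module name `blcr` is an assumption based on how BLCR is usually packaged. The command needed to load the module depends on how BLCR was installed.

```sh
# Sketch: report whether a kernel module appears in /proc/modules-style input.
# $1 = module name (assumed to be "blcr" for BLCR)
# $2 = file in /proc/modules format (defaults to /proc/modules)
module_loaded() {
    awk -v m="$1" '$1 == m { found = 1 } END { exit !found }' "${2:-/proc/modules}"
}

if module_loaded blcr 2>/dev/null; then
    echo "blcr module is loaded"
else
    # The load command is an assumption; it depends on the BLCR installation.
    echo "blcr module is not loaded; load it as root (for example: modprobe blcr)"
fi
```

The second argument exists only so the check can be run against a saved copy of /proc/modules from another node.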

However, I still can't find MCA:blcr in the ompi_info --all output, even though it's working.

This may have been a red herring, sorry. I think ompi_info will only show the 'none' component because of the way it searches for components in the system. This is a bug in how the CRS selection logic interacts with ompi_info. I will make a note/file a bug to look into fixing it. Unfortunately, I do not have a workaround other than looking in the install directory for the mca_crs_blcr.so file.
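
The directory check mentioned above could be scripted roughly as follows. This is a minimal sketch: the function name `crs_blcr_installed` and the `install_dir` variable are hypothetical, and the `lib/openmpi` subdirectory assumes the default install layout.

```sh
# Sketch: succeed if the BLCR CRS component file exists under an install prefix.
# $1 = Open MPI install prefix (default layout assumed: $prefix/lib/openmpi)
crs_blcr_installed() {
    [ -f "$1/lib/openmpi/mca_crs_blcr.so" ]
}

# Only runs when install_dir is set in the environment (hypothetical variable).
if [ -n "${install_dir:-}" ]; then
    if crs_blcr_installed "$install_dir"; then
        echo "mca_crs_blcr.so found under $install_dir"
    else
        echo "mca_crs_blcr.so not found under $install_dir"
    fi
fi
```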

-- Josh


marcin

2009/9/15 Marcin Stolarek <ms...@icm.edu.pl>
Hi,

I've done everything from the beginning:

rm  -r $ompi_install
make clean
make
make install

In $ompi_install, I've got files you mentioned:
mstol@halo2:/home/guests/mstol/openmpi/lib/openmp# ls mca_crs_bl*
mca_crs_blcr.la  mca_crs_blcr.so

but, when I try:
# ompi_info -all | grep "crs:"
mstol@halo2:/home/guests/mstol/openmpi/openmpi-1.3.3# ompi_info --all | grep "crs:"
                MCA crs: none (MCA v2.0, API v2.0, Component v1.3.3)
                MCA crs: parameter "crs_base_verbose" (current value: "0", data source: default value)
                MCA crs: parameter "crs" (current value: "none", data source: default value)
                MCA crs: parameter "crs_none_select_warning" (current value: "0", data source: default value)
                MCA crs: parameter "crs_none_priority" (current value: "0", data source: default value)

I don't have the crs: blcr component.
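
To separate component lines from parameter lines in output like the above, a small filter can help. This is a sketch that assumes the "MCA crs: NAME (...)" line layout shown here; the function name `list_crs_components` is hypothetical. As noted elsewhere in this thread, ompi_info may list only `none` even when the blcr component is installed, so an empty or `none`-only result is not conclusive.

```sh
# Sketch: read ompi_info output on stdin and print one CRS component name per line.
# Lines of the form 'MCA crs: parameter "..."' are skipped.
list_crs_components() {
    awk '$1 == "MCA" && $2 == "crs:" && $3 != "parameter" { print $3 }'
}

# Usage (requires ompi_info on the PATH):
#   ompi_info --all | list_crs_components
```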

marcin

2009/9/14 Josh Hursey <jjhur...@open-mpi.org>

The config.log looked fine, so I think you have fixed the configure problem that you previously posted about.

Though the config.log indicates that the BLCR component is scheduled for compile, ompi_info does not indicate that it is available. I suspect that the error below is because the CRS could not find any CRS components to select (though there should have been an error displayed indicating as such).

I would check your Open MPI installation to make sure that it is the one that you configured with. Specifically I would check to make sure that in the installation location there are the following files:
$install_dir/lib/openmpi/mca_crs_blcr.so
$install_dir/lib/openmpi/mca_crs_blcr.la

If that checks out, then I would remove the old installation directory and try reinstalling fresh.
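
One way to confirm that the ompi_info found on the PATH belongs to the installation that was configured is to compare its location with the install prefix. The sketch below assumes the default layout, with binaries in $prefix/bin, and does not resolve symlinks. The function names `prefix_of` and `same_install` are hypothetical.

```sh
# Sketch: print the install prefix implied by a binary's path,
# e.g. /opt/ompi/bin/ompi_info -> /opt/ompi (assumes binaries live in $prefix/bin).
prefix_of() {
    dirname "$(dirname "$1")"
}

# Succeed if the binary at path $1 lives under install prefix $2.
# A single trailing slash on $2 is ignored.
same_install() {
    [ "$(prefix_of "$1")" = "${2%/}" ]
}

# Usage (install_dir is a hypothetical variable holding the configured prefix):
#   same_install "$(command -v ompi_info)" "$install_dir" || echo "ompi_info comes from a different installation"
```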

Let me know how it goes.

-- Josh



On Sep 13, 2009, at 5:49 AM, Marcin Stolarek wrote:

I've tried again. Here is what I get when trying to run using 1.4a1r21964:

(terminus:~) mstol% mpirun --am ft-enable-cr ./a.out
--------------------------------------------------------------------------
It looks like opal_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

opal_cr_init() failed failed
--> Returned value -1 instead of OPAL_SUCCESS
--------------------------------------------------------------------------
[terminus:06120] [[INVALID],INVALID] ORTE_ERROR_LOG: Error in file runtime/orte_init.c at line 79
--------------------------------------------------------------------------
It looks like MPI_INIT failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during MPI_INIT; some of which are due to configuration or environment
problems.  This failure appears to be an internal failure; here's some
additional information (which may only be relevant to an Open MPI
developer):

ompi_mpi_init: orte_init failed
--> Returned "Error" (-1) instead of "Success" (0)
--------------------------------------------------------------------------
*** An error occurred in MPI_Init
*** before MPI was initialized
*** MPI_ERRORS_ARE_FATAL (your MPI job will now abort)
[terminus:6120] Abort before MPI_INIT completed successfully; not able to guarantee that all other processes were killed!
--------------------------------------------------------------------------
mpirun noticed that the job aborted, but has no info as to the process
that caused that situation.
--------------------------------------------------------------------------

I've attached config.log and the ompi_info --all output.
LD_LIBRARY_PATH is set correctly.
Any ideas?

marcin





2009/9/12 Marcin Stolarek <ms...@icm.edu.pl>
Hi,
I'm trying to compile Open MPI with checkpoint/restart via BLCR. I'm not sure which path I should set as the value of the --with-blcr option.
I'm using the 1.3.3 release; which version of BLCR should I use?

I've compiled the newest version of BLCR with --prefix=$BLCR and passed --with-blcr=$BLCR to the Open MPI configure script, but I received:


configure:76646: checking if MCA component crs:blcr can compile
configure:76648: result: no
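
The result shown above can be pulled out of config.log without scrolling through the whole file. This is a sketch that assumes the "checking ..." line is followed by a "result:" line, as in the excerpt; the function name `crs_blcr_configure_result` is hypothetical.

```sh
# Sketch: print the configure result for the crs:blcr component check.
# $1 = path to config.log
# Prints the text after "result:" (for example "yes" or "no"),
# or nothing if the check is not present in the file.
crs_blcr_configure_result() {
    awk '
        /checking if MCA component crs:blcr can compile/ { want = 1; next }
        want && /result:/ { sub(/.*result: */, ""); print; exit }
    ' "$1"
}

# Usage:
#   crs_blcr_configure_result config.log
```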

marcin





<info.tar.gz>_______________________________________________
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users
