[OMPI users] Signal: Segmentation fault (11) Signal code: Address not mapped (1)

2009-09-09 Thread Jean Potsam
Dear All,

I have installed Open MPI 1.3.2 in my home directory (/home/jean/openmpisof/)
and BLCR in /usr/local/blcr. I have added the following to my .bashrc file:

export PATH=/home/jean/openmpisof/bin/:$PATH
export LD_LIBRARY_PATH=/home/jean/openmpisof/lib/:$LD_LIBRARY_PATH

export PATH=/usr/local/blcr/bin/:$PATH
export LD_LIBRARY_PATH=/usr/local/blcr/lib:$LD_LIBRARY_PATH

I am running my application as follows:

mpirun -am ft-enable-cr -mca btl ^openib -mca snapc_base_global_snapshot_dir 
/tmp mpitest
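
The checkpoint itself is requested from a second terminal, roughly as follows
(this assumes the standard ompi-checkpoint tool shipped with Open MPI; PID
stands for the process id of mpirun, and the exact options may differ):

ompi-checkpoint PID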

But I get the following error when I try to checkpoint the application:

##   
[sun06:20513] *** Process received signal ***
[sun06:20513] Signal: Segmentation fault (11)
[sun06:20513] Signal code: Address not mapped (1)
[sun06:20513] Failing at address: 0x4
[sun06:20513] [ 0] [0xb7fab40c]
[sun06:20513] [ 1] /lib/libc.so.6(cfree+0x3b) [0xb79e468b]
[sun06:20513] [ 2] /usr/local/blcr/lib/libcr.so.0(cri_info_free+0x2a) 
[0xb7b1725a]
[sun06:20513] [ 3] /usr/local/blcr/lib/libcr.so.0 [0xb7b18c72]
[sun06:20513] [ 4] /lib/libc.so.6(__libc_fork+0x186) [0xb7a0d266]
[sun06:20513] [ 5] /lib/libpthread.so.0(fork+0x14) [0xb7ac4b24]
[sun06:20513] [ 6] /home/jean/openmpisof/lib/libopen-pal.so.0 [0xb7bc2a01]
[sun06:20513] [ 7] 
/home/jean/openmpisof/lib/libopen-pal.so.0(opal_crs_blcr_checkpoint+0x187) 
[0xb7bc231b]
[sun06:20513] [ 8] 
/home/jean/openmpisof/lib/libopen-pal.so.0(opal_cr_inc_core+0xc3) [0xb7b8eb1d]
[sun06:20513] [ 9] /home/jean/openmpisof/lib/libopen-rte.so.0 [0xb7cab40f]
[sun06:20513] [10] 
/home/jean/openmpisof/lib/libopen-pal.so.0(opal_cr_test_if_checkpoint_ready+0x129)
 [0xb7b8ea2a]
[sun06:20513] [11] /home/jean/openmpisof/lib/libopen-pal.so.0 [0xb7b8f0f8]
[sun06:20513] [12] /lib/libpthread.so.0 [0xb7abbf3b]
[sun06:20513] [13] /lib/libc.so.6(clone+0x5e) [0xb7a42bee]
[sun06:20513] *** End of error message ***
###

Any help would be much appreciated.

Regards,

Jean



  

Re: [OMPI users] Messages getting lost during transmission (?)

2009-09-09 Thread Richard Treumann

Dennis

In MPI, you must complete every MPI_Isend with an MPI_Wait on the request handle
(or a variant such as MPI_Waitall, or an MPI_Test that returns TRUE). An
uncompleted MPI_Isend leaves resources tied up.

I do not know what symptom to expect from Open MPI with this particular
application error, but the one you describe is plausible.
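
For reference, a minimal sketch of that completion rule (the variable names and
the destination/tag values below are only illustrative; it assumes MPI has
already been initialized and a matching receive is posted somewhere):

    int payload = 42;
    int dest = 1, tag = 0;            /* illustrative values */
    MPI_Request req;

    /* start the non-blocking send */
    MPI_Isend(&payload, 1, MPI_INT, dest, tag, MPI_COMM_WORLD, &req);

    /* ... overlap other work and receives here ... */

    /* every MPI_Isend must eventually be completed, e.g. with MPI_Wait
       (or MPI_Waitall for many requests, or repeated MPI_Test calls
       until the completion flag comes back true) */
    MPI_Wait(&req, MPI_STATUS_IGNORE);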


Dick Treumann  -  MPI Team
IBM Systems & Technology Group
Dept X2ZA / MS P963 -- 2455 South Road -- Poughkeepsie, NY 12601
Tele (845) 433-7846 Fax (845) 433-8363


users-boun...@open-mpi.org wrote on 09/09/2009 11:47:12 AM:

>
> Hi all,
>
> I am seeing very strange behaviour in a program. It seems that messages
> sent from one processor to another are getting lost.
>
> The problem is isolated in the attached source code. The code works as
> follows: two processes send each other 100k requests. Each request is
> answered and triggers a number of requests to the other process in
> return. As you might already suspect, the communication is asynchronous.
>
> I have already debugged the application and found that at some point
> during the communication at least one of the processes no longer receives
> any messages and hangs in the while loop beginning at line 45.
>
> The program is started with two processes on a single machine and no
> other parameters: "mpirun -np 2 ./mpi_test2".
>
> I appreciate your help.
>
> Best wishes,
> Dennis
>
> --
> Dennis Luxen
> Universität Karlsruhe (TH)   | Fon  : +49 (721) 608-6781
> Institut für Theoretische Informatik | Fax  : +49 (721) 608-3088
> Am Fasanengarten 5, Zimmer 220   | WWW  : algo2.ira.uka.de/luxen
> D-76131 Karlsruhe, Germany   | Email: lu...@kit.edu
> 
>
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
> #include 
>
> std::ofstream output_file;
>
> enum {REQUEST_TAG=4321, ANSWER_TAG, FINISHED_TAG};
>
> typedef int Answer_type;
>
>
> int main(int argc, char *argv[])
> {
>MPI_Init (&argc, &argv);   // starts MPI
>int number_of_PEs, my_PE_ID;
>MPI_Comm_size(MPI_COMM_WORLD, &number_of_PEs);
>assert(number_of_PEs == 2);
>MPI_Comm_rank(MPI_COMM_WORLD, &my_PE_ID);
>
>std::srand(123456);
>
>int number_of_requests_to_send = 10;
>int number_of_requests_to_recv = number_of_requests_to_send;
>int number_of_answers_to_recv  = number_of_requests_to_send;
>
>std::stringstream filename;
>filename<<"output"
>int buffer[100];
>MPI_Request dummy_request;
>
>//Send the first request
>MPI_Isend(buffer, 1, MPI_INT, 1-my_PE_ID, REQUEST_TAG,
> MPI_COMM_WORLD, &dummy_request);
>number_of_requests_to_send--;
>
>int working_PEs = number_of_PEs;
>bool lack_of_work_sent = false;
>bool there_was_change = true;
>while(working_PEs > 0)
>{
>   if(there_was_change)
>   {
>  there_was_change = false;
>  std::cout
>return 0;
> }

[OMPI users] Messages getting lost during transmission (?)

2009-09-09 Thread Dennis Luxen

Hi all,

I am seeing very strange behaviour in a program. It seems that messages
sent from one processor to another are getting lost.

The problem is isolated in the attached source code. The code works as
follows: two processes send each other 100k requests. Each request is
answered and triggers a number of requests to the other process in
return. As you might already suspect, the communication is asynchronous.


I have already debugged the application and found that at some point during
the communication at least one of the processes no longer receives any
messages and hangs in the while loop beginning at line 45.


The program is started with two processes on a single machine and no 
other parameters: "mpirun -np 2 ./mpi_test2".


I appreciate your help.

Best wishes,
Dennis

--
Dennis Luxen
Universität Karlsruhe (TH)   | Fon  : +49 (721) 608-6781
Institut für Theoretische Informatik | Fax  : +49 (721) 608-3088
Am Fasanengarten 5, Zimmer 220   | WWW  : algo2.ira.uka.de/luxen
D-76131 Karlsruhe, Germany   | Email: lu...@kit.edu


#include 
#include 
#include 
#include 
#include 
#include 
#include 
#include 

std::ofstream output_file;

enum {REQUEST_TAG=4321, ANSWER_TAG, FINISHED_TAG};

typedef int Answer_type;


int main(int argc, char *argv[])
{
	MPI_Init (&argc, &argv);	// starts MPI
	int number_of_PEs, my_PE_ID;
	MPI_Comm_size(MPI_COMM_WORLD, &number_of_PEs);
	assert(number_of_PEs == 2);
	MPI_Comm_rank(MPI_COMM_WORLD, &my_PE_ID);

	std::srand(123456);

	int number_of_requests_to_send = 10;
	int number_of_requests_to_recv = number_of_requests_to_send;
	int number_of_answers_to_recv  = number_of_requests_to_send;

	std::stringstream filename;
	filename<<"output"< 0)
	{
		if(there_was_change)
		{
			there_was_change = false;
			std::cout< Package: Open MPI abuild@build26 Distribution
Open MPI: 1.3.2
   Open MPI SVN revision: r21054
   Open MPI release date: Apr 21, 2009
Open RTE: 1.3.2
   Open RTE SVN revision: r21054
   Open RTE release date: Apr 21, 2009
OPAL: 1.3.2
   OPAL SVN revision: r21054
   OPAL release date: Apr 21, 2009
Ident string: 1.3.2
  Prefix: /usr/lib64/mpi/gcc/openmpi
 Configured architecture: x86_64-suse-linux-gnu
  Configure host: build26
   Configured by: abuild
   Configured on: Tue May  5 16:03:55 UTC 2009
  Configure host: build26
Built by: abuild
Built on: Tue May  5 16:18:52 UTC 2009
  Built host: build26
  C bindings: yes
C++ bindings: yes
  Fortran77 bindings: yes (all)
  Fortran90 bindings: yes
 Fortran90 bindings size: small
  C compiler: gcc
 C compiler absolute: /usr/bin/gcc
C++ compiler: g++
   C++ compiler absolute: /usr/bin/g++
  Fortran77 compiler: gfortran
  Fortran77 compiler abs: /usr/bin/gfortran
  Fortran90 compiler: gfortran
  Fortran90 compiler abs: /usr/bin/gfortran
 C profiling: yes
   C++ profiling: yes
 Fortran77 profiling: yes
 Fortran90 profiling: yes
  C++ exceptions: no
  Thread support: posix (mpi: no, progress: no)
   Sparse Groups: no
  Internal debug support: no
 MPI parameter check: runtime
Memory profiling support: no
Memory debugging support: no
 libltdl support: yes
   Heterogeneous support: no
 mpirun default --prefix: no
 MPI I/O support: yes
   MPI_WTIME support: gettimeofday
Symbol visibility support: yes
   FT Checkpoint support: no  (checkpoint thread: no)
   MCA backtrace: execinfo (MCA v2.0, API v2.0, Component v1.3.2)
  MCA memory: ptmalloc2 (MCA v2.0, API v2.0, Component v1.3.2)
   MCA paffinity: linux (MCA v2.0, API v2.0, Component v1.3.2)
   MCA carto: auto_detect (MCA v2.0, API v2.0, Component v1.3.2)
   MCA carto: file (MCA v2.0, API v2.0, Component v1.3.2)
   MCA maffinity: first_use (MCA v2.0, API v2.0, Component v1.3.2)
   MCA timer: linux (MCA v2.0, API v2.0, Component v1.3.2)
 MCA installdirs: env (MCA v2.0, API v2.0, Component v1.3.2)
 MCA installdirs: config (MCA v2.0, API v2.0, Component v1.3.2)
 MCA dpm: orte (MCA v2.0, API v2.0, Component v1.3.2)
  MCA pubsub: orte (MCA v2.0, API v2.0, Component v1.3.2)
   MCA allocator: basic (MCA v2.0, API v2.0, Component v1.3.2)
   MCA allocator: bucket (MCA v2.0, API v2.0, Component v1.3.2)
MCA coll: basic (MCA v2.0, API v2.0, Component v1.3.2)
MCA coll: hierarch (MCA v2.0, API v2.0, Component v1.3.2)
MCA coll: inter (MCA v2.0, API v2.0, Component v1.3.2)
MCA coll: self (MCA v2.0, API v2.0, Component v1.3.2)
MCA coll: sm (MCA v2.0, API v2.0, Component v1

Re: [OMPI users] [OMPI devel] Error message improvement

2009-09-09 Thread George Bosilca
__func__ is what you should use. We take care of having it defined in
_all_ cases. If the compiler doesn't support it, we define it manually
(to __FUNCTION__ or, in the worst case, to __FILE__), so it is always
available (even if it doesn't contain what one might expect, as in the
case of __FILE__).
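
For illustration only, that kind of fallback typically looks something like the
following sketch (the guard macro name HAVE_DECL___FUNC__ is made up here and is
not the actual Open MPI configure symbol):

    #if !defined(HAVE_DECL___FUNC__)        /* compiler has no native __func__ */
    #  if defined(__GNUC__)
    #    define __func__ __FUNCTION__       /* old GCC extension with the same meaning */
    #  else
    #    define __func__ __FILE__           /* worst case: at least name the file */
    #  endif
    #endif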


  george.

On Sep 9, 2009, at 14:16 , Lenny Verkhovsky wrote:


Hi All,
Is a C99-compliant compiler something unusual, or is there a policy
among OMPI developers/users that prevents me from using __func__
instead of hardcoded strings in the code?
Thanks.
Lenny.

On Wed, Sep 9, 2009 at 1:48 PM, Nysal Jan  wrote:
__FUNCTION__ is not portable.
__func__ is, but it needs a C99-compliant compiler.

--Nysal


On Tue, Sep 8, 2009 at 9:06 PM, Lenny Verkhovsky wrote:

fixed in r21952
thanks.

On Tue, Sep 8, 2009 at 5:08 PM, Arthur Huillet wrote:

Lenny Verkhovsky wrote:
Why not using __FUNCTION__  in all our error messages ???

Sounds good, this way the function names are always correct.


--
Greetings, A. Huillet



Re: [OMPI users] [OMPI devel] Error message improvement

2009-09-09 Thread Lenny Verkhovsky
Hi All,
Is a C99-compliant compiler something unusual, or is there a policy
among OMPI developers/users that prevents me from using __func__
instead of hardcoded strings in the code?
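
In other words, the change just lets the compiler supply the function name
instead of a hardcoded string, along these lines (the function name in this
sketch is made up, and it assumes <stdio.h> is included somewhere above):

    /* before: the hardcoded name silently goes stale when code moves */
    fprintf(stderr, "mca_btl_foo_send: invalid fragment size\n");

    /* after: __func__ always matches the enclosing function */
    fprintf(stderr, "%s: invalid fragment size\n", __func__);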
Thanks.
Lenny.

On Wed, Sep 9, 2009 at 1:48 PM, Nysal Jan  wrote:

> __FUNCTION__ is not portable.
> __func__ is, but it needs a C99-compliant compiler.
>
> --Nysal
>
> On Tue, Sep 8, 2009 at 9:06 PM, Lenny Verkhovsky <
> lenny.verkhov...@gmail.com> wrote:
>
>> fixed in r21952
>> thanks.
>>
>> On Tue, Sep 8, 2009 at 5:08 PM, Arthur Huillet 
>> wrote:
>>
>>> Lenny Verkhovsky wrote:
>>>
 Why not using __FUNCTION__  in all our error messages ???

>>>
>>> Sounds good, this way the function names are always correct.
>>>
>>> --
>>> Greetings, A. Huillet


Re: [OMPI users] SVD with mpi

2009-09-09 Thread Yann JOBIC

Attila Börcs wrote:

Hi Everyone,

I'd like to compute a singular value decomposition with MPI. I have heard
about the Lanczos algorithm and a few other algorithms for SVD, but I need
some help with this topic. Does anybody know of usable code or a tutorial
on parallel SVD?


Best Regards,

Attila

If you need a full decomposition, ScaLAPACK is the best choice.
Otherwise, you may take a look at SLEPc (which uses the PETSc framework).
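
As a quick reminder of why an eigensolver such as Lanczos is relevant here
(standard identities, stated only for reference):

    A = U * Sigma * V^T                  (SVD of A)
    A^T A v_i = sigma_i^2 v_i            (right singular vectors solve the
                                          symmetric eigenproblem of A^T A)
    u_i = (1/sigma_i) * A v_i            (left singular vectors, for sigma_i > 0)

Any parallel symmetric eigensolver applied to A^T A, or to the augmented
matrix [0 A; A^T 0], therefore yields the singular triplets; as far as I
know this is essentially what SLEPc's SVD solvers automate for you.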


Yann


Re: [OMPI users] SVD with mpi

2009-09-09 Thread Terry Frankcombe

Take a look at http://www.netlib.org/scalapack/

Ciao
Terry


On Tue, 2009-09-08 at 13:55 +0200, Attila Börcs wrote:
> Hi Everyone, 
> 
> I'd like to compute a singular value decomposition with MPI. I have heard
> about the Lanczos algorithm and a few other algorithms for SVD, but I need
> some help with this topic. Does anybody know of usable code or a tutorial
> on parallel SVD?
> 
> Best Regards, 
> 
> Attila