[OMPI users] mpi programs
Dear Sir,

I would be grateful if you could send me copies of Open MPI programs written in C for the following problems:

1. Fast Fourier transform
2. Gaussian elimination
3. LU decomposition

Awaiting your reply.

With best regards,
Mallikarjuna Shastry
[OMPI users] error in checkpointing an mpi application
Dear Sir,

I am sending the details as follows:

1. I am using openmpi-1.3.3 and blcr 0.8.2.
2. I installed blcr 0.8.2 first, under /root/MS.
3. Then I installed openmpi 1.3.3 under /root/MS.
4. I configured and installed Open MPI as follows:

   # ./configure --with-ft=cr --enable-mpi-threads --with-blcr=/usr/local/bin --with-blcr-libdir=/usr/local/lib
   # make
   # make install

Then I added the following to .bash_profile in my home directory (I went to the home directory with cd ~):

   /sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
   /sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko
   PATH=$PATH:/usr/local/bin
   MANPATH=$MANPATH:/usr/local/man
   LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

Then I compiled and ran the file arr_add.c as follows:

   [root@localhost examples]# mpicc -o res arr_add.c
   [root@localhost examples]# mpirun -np 2 -am ft-enable-cr ./res
   2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
   --
   Error: The process with PID 5790 is not checkpointable.
   This could be due to one of the following:
    - An application with this PID doesn't currently exist
    - The application with this PID isn't checkpointable
    - The application with this PID isn't an OPAL application.
   We were looking for the named files:
     /tmp/opal_cr_prog_write.5790
     /tmp/opal_cr_prog_read.5790
   --
   [localhost.localdomain:05788] local) Error: Unable to initiate the handshake with peer [[7788,1],1]. -1
   [localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 567
   [localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054
   2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

NOTE: the PID of mpirun is 5788.

I gave the following command to take the checkpoint:

   [root@localhost examples]# ompi-checkpoint -s 5788

I got the following output, after which the command hung:

   [localhost.localdomain:05796] Requested - Global Snapshot Reference: (null)
   [localhost.localdomain:05796] Pending - Global Snapshot Reference: (null)
   [localhost.localdomain:05796] Running - Global Snapshot Reference: (null)

Can anybody resolve this problem? Kindly rectify it.

With regards,
Mallikarjuna Shastry
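The error above lists "isn't checkpointable" as one possible cause. One way to narrow that down is to check whether the Open MPI build actually contains a BLCR checkpoint/restart component and whether the blcr kernel module is loaded. The following is only a minimal sketch, not a confirmed fix for this report. The helper names `has_blcr_crs` and `has_blcr_module` are hypothetical, and the pattern assumes that `ompi_info` lists components in the form `MCA crs: blcr (...)`.

```shell
#!/bin/sh
# Sketch: sanity checks to run before trying ompi-checkpoint.
# Helper names are illustrative; the grep patterns are assumptions
# about the usual ompi_info and lsmod output formats.

# Succeeds if stdin (ompi_info output) lists a BLCR "crs" component.
has_blcr_crs() {
    grep -Eq 'MCA crs: *blcr'
}

# Succeeds if stdin (lsmod output) lists the blcr module itself
# (blcr_imports alone does not count).
has_blcr_module() {
    grep -Eq '^blcr[[:space:]]'
}

# Run the live checks only when the tools are available on this machine.
if command -v ompi_info >/dev/null 2>&1; then
    if ompi_info | has_blcr_crs; then
        echo "Open MPI reports a BLCR crs component"
    else
        echo "No BLCR crs component found in this Open MPI build"
    fi
fi

if command -v lsmod >/dev/null 2>&1; then
    if lsmod | has_blcr_module; then
        echo "blcr kernel module is loaded"
    else
        echo "blcr kernel module is not loaded"
    fi
fi
```

If the first check reports no BLCR component, the `mpirun` and `mpicc` found on the PATH may belong to a different Open MPI installation than the one configured with `--with-ft=cr`.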
[OMPI users] error in ompi-checkpoint
[root@localhost examples]# mpirun -np 4 -am ft-enable-cr ./res
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
--
Error: The process with PID 19735 is not checkpointable.
This could be due to one of the following:
 - An application with this PID doesn't currently exist
 - The application with this PID isn't checkpointable
 - The application with this PID isn't an OPAL application.
We were looking for the named files:
  /tmp/opal_cr_prog_write.19735
  /tmp/opal_cr_prog_read.19735
--
[localhost.localdomain:19733] local) Error: Unable to initiate the handshake with peer [[17893,1],1]. -1
[localhost.localdomain:19733] [[17893,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 567
[localhost.localdomain:19733] [[17893,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

Note: the PID of mpirun is 19733.
[OMPI users] open mpi 1.3 with blcr
Dear Sir/Madam,

Kindly advise me how to configure Open MPI 1.3 with blcr-0.7.3 or blcr-0.8.2 so that I can checkpoint Open MPI programs.

With regards,
Mallikarjuna Shastry
[OMPI users] problem in using blcr
Dear Sir/Madam,

I am not able to checkpoint Open MPI programs using BLCR. I am using openmpi 1.3.3 and blcr 0.8.2. Kindly tell me how to configure Open MPI with BLCR so that I can checkpoint my MPI programs using the BLCR checkpoint library.

Thanking you.

With regards,
Mallikarjuna Shastry
Re: [OMPI users] users Digest, Vol 1296, Issue 6
Dear Sir/Madam,

Kindly tell me the commands for checkpointing and restarting MPI programs using Intel MPI. I tried the following commands, but they did not work:

   ompi_checkpoint
   ompi_restart (file name of the global snapshot)

With regards,
Mallikarjuna Shastry
[OMPI users] (no subject)
Dear Sir/Madam,

What are the MPI functions used for taking a checkpoint and restarting from within the application in MPI programs, and where do I get these functions from?

With regards,
Mallikarjuna Shastry