[OMPI users] DMTCP: Checkpoint-Restart solution for OpenMPI

2010-01-31 Thread Kapil Arya
Hi All,

On January 29, 2010, we produced a new release (1.1.3) of
DMTCP (Distributed MultiThreaded CheckPointing). Its web page is at
http://dmtcp.sourceforge.net/ . We (the developers of DMTCP) have
carefully tested this version of DMTCP against OpenMPI 1.4.1,
and we believe it to be working well. We would welcome feedback from
any OpenMPI users who would care to test it on their own applications.

The DMTCP package provides an alternative solution for
checkpoint-restart of OpenMPI computations.  Using it is as simple as:
 dmtcp_checkpoint mpirun ./hello_mpi
 # Manually checkpoint from any other terminal
 dmtcp_command --checkpoint
 # Execute the restart script, which restarts from the generated checkpoint images.
 ./dmtcp_restart_script.sh

DMTCP works by creating a separate, stateless checkpoint coordinator,
independent of OpenMPI's orterun.  All OpenMPI processes are then
checkpointed, including orterun.  At restart time, a new DMTCP
checkpoint coordinator can be used.  DMTCP is transparent and runs
entirely in user space.  There is no modification to the MPI
application binary, to OpenMPI, or to the operating system kernel.
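
For example, the coordinator can be started by hand and the
computation attached to it explicitly.  A sketch only: the port number
and the DMTCP_HOST/DMTCP_PORT environment variables below are
assumptions based on DMTCP's usual defaults; check the DMTCP
documentation for the exact options.
 # terminal 1: start a standalone checkpoint coordinator
 dmtcp_coordinator
 # terminal 2: point the job at that coordinator and launch it
 DMTCP_HOST=localhost DMTCP_PORT=7779 dmtcp_checkpoint mpirun ./hello_mpi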

DMTCP also supports a dmtcpaware interface (application-initiated
checkpoints) and numerous other features.  At this time, DMTCP
supports only Ethernet (TCP/IP) and shared memory as transports.
We are looking at supporting the InfiniBand transport layer
in the future.
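
As a rough illustration of an application-initiated checkpoint (a
sketch only; the function names are assumptions about the dmtcpaware.h
API of the 1.x series and should be checked against the header and
library shipped with your DMTCP installation):

 /* Sketch only: assumes the dmtcpaware.h interface of DMTCP 1.x and
  * linking against its dmtcpaware library; check your installed header. */
 #include <stdio.h>
 #include "dmtcpaware.h"

 int main(void)
 {
   if (dmtcpIsEnabled()) {   /* are we running under dmtcp_checkpoint? */
     printf("requesting a checkpoint from within the application\n");
     dmtcpCheckpoint();      /* returns after the checkpoint (or after a restart) */
   }
   return 0;
 }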

Finally, a bit of history.  DMTCP began with a goal of checkpointing
distributed desktop applications.  We recognize the fine
checkpoint-restart solution that already exists in OpenMPI: the
checkpoint-restart service on top of BLCR.  We offer DMTCP as an
alternative for some unusual situations, such as when the end user
does not have the privileges needed to add the BLCR kernel module.  We are eager
to gain feedback from the OpenMPI community.

Thanks,
DMTCP Developers


Re: [OMPI users] Test OpenMPI on a cluster

2010-01-31 Thread Terry Frankcombe
It seems your OpenMPI installation is not PBS-aware.

Either reinstall OpenMPI configured for PBS (and then you don't even
need -np 10), or, as Constantinos says, find the PBS nodefile and pass
that to mpirun.
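
For concreteness, a sketch of both options (the Torque install path
below is an assumption; adjust it for your site):

 # Option 1: rebuild Open MPI with Torque/PBS (tm) support, so that
 # mpirun learns the allocated nodes from the scheduler by itself:
 ./configure --with-tm=/usr/local/torque
 make all install

 # Option 2: inside the PBS job script, pass the scheduler's node list
 # to mpirun explicitly:
 mpirun -np 10 -machinefile $PBS_NODEFILE /home/tim/courses/MPI/examples/ex1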


On Sat, 2010-01-30 at 18:45 -0800, Tim wrote:
> Hi,  
>   
> I am learning MPI on a cluster. Here is one simple example. I expect the 
> output would show response from different nodes, but they all respond from 
> the same node node062. I just wonder why and how I can actually get report 
> from different nodes to show MPI actually distributes processes to different 
> nodes? Thanks and regards!
>   
> ex1.c  
>   
> /* test of MPI */  
> #include "mpi.h"  
> #include <stdio.h>  
> #include <string.h>  
>   
> int main(int argc, char **argv)  
> {  
> char idstr[2232]; char buff[22128];  
> char processor_name[MPI_MAX_PROCESSOR_NAME];  
> int numprocs; int myid; int i; int namelen;  
> MPI_Status stat;  
>   
> MPI_Init(&argc,&argv);  
> MPI_Comm_size(MPI_COMM_WORLD,&numprocs);  
> MPI_Comm_rank(MPI_COMM_WORLD,&myid);  
> MPI_Get_processor_name(processor_name, &namelen);  
>   
> if(myid == 0)  
> {  
>   printf("WE have %d processors\n", numprocs);  
>   for(i=1;i<numprocs;i++)   {  
> sprintf(buff, "Hello %d", i);  
> MPI_Send(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD); }  
> for(i=1;i<numprocs;i++) {  
>   MPI_Recv(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);  
>   printf("%s\n", buff);  
> }  
> }  
> else  
> {   
>   MPI_Recv(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);  
>   sprintf(idstr, " Processor %d at node %s ", myid, processor_name);  
>   strcat(buff, idstr);  
>   strcat(buff, "reporting for duty\n");  
>   MPI_Send(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD);  
> }  
> MPI_Finalize();  
>   
> }  
>   
> ex1.pbs  
>   
> #!/bin/sh  
> #  
> #This is an example script example.sh  
> #  
> #These commands set up the Grid Environment for your job:  
> #PBS -N ex1  
> #PBS -l nodes=10:ppn=1,walltime=1:10:00  
> #PBS -q dque  
>   
> # export OMP_NUM_THREADS=4  
>   
>  mpirun -np 10 /home/tim/courses/MPI/examples/ex1  
>   
> compile and run:
> 
> [tim@user1 examples]$ mpicc ./ex1.c -o ex1   
> [tim@user1 examples]$ qsub ex1.pbs  
> 35540.mgt  
> [tim@user1 examples]$ nano ex1.o35540  
>   
> Begin PBS Prologue Sat Jan 30 21:28:03 EST 2010 1264904883  
> Job ID: 35540.mgt  
> Username:   tim  
> Group:  Brown  
> Nodes:  node062 node063 node169 node170 node171 node172 node174 
> node175  
> node176 node177  
> End PBS Prologue Sat Jan 30 21:28:03 EST 2010 1264904883  
>   
> WE have 10 processors  
> Hello 1 Processor 1 at node node062 reporting for duty  
>   
> Hello 2 Processor 2 at node node062 reporting for duty  
>   
> Hello 3 Processor 3 at node node062 reporting for duty  
>   
> Hello 4 Processor 4 at node node062 reporting for duty  
>   
> Hello 5 Processor 5 at node node062 reporting for duty  
>   
> Hello 6 Processor 6 at node node062 reporting for duty  
>   
> Hello 7 Processor 7 at node node062 reporting for duty  
>   
> Hello 8 Processor 8 at node node062 reporting for duty  
>   
> Hello 9 Processor 9 at node node062 reporting for duty  
>   
>   
> Begin PBS Epilogue Sat Jan 30 21:28:11 EST 2010 1264904891  
> Job ID: 35540.mgt  
> Username:   tim  
> Group:  Brown  
> Job Name:   ex1  
> Session:15533  
> Limits: neednodes=10:ppn=1,nodes=10:ppn=1,walltime=01:10:00  
> Resources:  cput=00:00:00,mem=420kb,vmem=8216kb,walltime=00:00:03  
> Queue:  dque  
> Account:  
> Nodes:  node062 node063 node169 node170 node171 node172 node174 node175 
> node176  
> node177  
> Killing leftovers...  
>   
> End PBS Epilogue Sat Jan 30 21:28:11 EST 2010 1264904891  
> 
> 
> 
> 
>   



Re: [OMPI users] Test OpenMPI on a cluster

2010-01-31 Thread Constantinos Makassikis

Tim wrote:
Hi,  
  
I am learning MPI on a cluster. Here is one simple example. I expect the output would show response from different nodes, but they all respond from the same node node062. I just wonder why and how I can actually get report from different nodes to show MPI actually distributes processes to different nodes? Thanks and regards!
  
ex1.c  
  
/* test of MPI */  
#include "mpi.h"  
#include <stdio.h>  
#include <string.h>  
  
int main(int argc, char **argv)  
{  
char idstr[2232]; char buff[22128];  
char processor_name[MPI_MAX_PROCESSOR_NAME];  
int numprocs; int myid; int i; int namelen;  
MPI_Status stat;  
  
MPI_Init(&argc,&argv);  
MPI_Comm_size(MPI_COMM_WORLD,&numprocs);  
MPI_Comm_rank(MPI_COMM_WORLD,&myid);  
MPI_Get_processor_name(processor_name, &namelen);  
  
if(myid == 0)  
{  
  printf("WE have %d processors\n", numprocs);  
  for(i=1;i<numprocs;i++)  {  
sprintf(buff, "Hello %d", i);  
MPI_Send(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD); }  
for(i=1;i<numprocs;i++) {  
  MPI_Recv(buff, 128, MPI_CHAR, i, 0, MPI_COMM_WORLD, &stat);  
  printf("%s\n", buff);  
}  
}  
else  
{   
  MPI_Recv(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD, &stat);  
  sprintf(idstr, " Processor %d at node %s ", myid, processor_name);  
  strcat(buff, idstr);  
  strcat(buff, "reporting for duty\n");  
  MPI_Send(buff, 128, MPI_CHAR, 0, 0, MPI_COMM_WORLD);  
}  
MPI_Finalize();  
  
}  
  
ex1.pbs  
  
#!/bin/sh  
#  
#This is an example script example.sh  
#  
#These commands set up the Grid Environment for your job:  
#PBS -N ex1  
#PBS -l nodes=10:ppn=1,walltime=1:10:00  
#PBS -q dque  
  
# export OMP_NUM_THREADS=4  
  
 mpirun -np 10 /home/tim/courses/MPI/examples/ex1  
  

Try running your program with the following:

mpirun -np 10 -machinefile machines /home/tim/courses/MPI/examples/ex1 


where 'machines' is a file containing the names of your nodes (one per line)

node063
node064
...
node177
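
(Side note: when the job runs under PBS/Torque, the scheduler already
writes such a node list to the file named by $PBS_NODEFILE, so the job
script can simply build the machines file from it, assuming the usual
Torque setup:

 cat $PBS_NODEFILE > machines
)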


HTH,

--
Constantinos Makassikis

  

  




[OMPI users] Create group in a non-collective way

2010-01-31 Thread Yiannis Papadopoulos
Hi,

In my code I need some of the processes to create a group.
Now, in general, the way of doing that is as follows (correct me if I'm wrong):

int ranks[] = { 1,2,3 };
int rank;
MPI_Group world_group = MPI_GROUP_NULL;
MPI_Group subgroup = MPI_GROUP_NULL;
MPI_Comm subcomm = MPI_COMM_NULL;

MPI_Comm_rank(MPI_COMM_WORLD, &rank); // local operation
MPI_Comm_group(MPI_COMM_WORLD, &world_group); // local operation
MPI_Group_incl(world_group, 3, ranks, &subgroup); // local operation
MPI_Comm_create(MPI_COMM_WORLD, subgroup, &subcomm); // collective operation on MPI_COMM_WORLD

if (rank>0 && rank<4) {
  // do something with subcomm
}

// cleanup

Is there any way to create the communicator inside the if?

Thanks