[OMPI users] (no subject)
Hi all,

Running a CUDA+MPI application on a node with 2 K80 GPUs, I get the following warnings:

--
WARNING: There is at least non-excluded one OpenFabrics device found,
but there are no active ports detected (or Open MPI was unable to use
them). This is most certainly not what you wanted. Check your cables,
subnet manager configuration, etc. The openib BTL will be ignored for
this job.

  Local host: gpu01
--
[gpu01:107262] 1 more process has sent help message help-mpi-btl-openib.txt / no active ports found
[gpu01:107262] Set MCA parameter "orte_base_help_aggregate" to 0 to see all help / error messages

Any idea of what is going on and how I can fix this? I am using OpenMPI 3.1.2.
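If the InfiniBand ports on that node really are unused (e.g., the job only needs shared memory and TCP), one way to make the warning go away is to exclude the openib BTL explicitly. A minimal sketch, with the application name as a placeholder:

mpirun --mca btl ^openib -np 2 ./my_cuda_app

The "^" prefix means "every BTL except the listed ones". Alternatively, if openib should stay available on machines where it does work, setting the MCA parameter btl_base_warn_component_unused to 0 merely suppresses this class of warning.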
Re: [OMPI users] (no subject)
That's pretty weird. I notice that you're using 3.1.0rc2. Does the same thing happen with Open MPI 3.1.3?

> On Oct 31, 2018, at 9:08 PM, Dmitry N. Mikushin wrote:
>
> Dear all,
>
> ompi_info reports that pml components are available:
>
> $ /usr/mpi/gcc/openmpi-3.1.0rc2/bin/ompi_info -a | grep pml
> MCA pml: v (MCA v2.1.0, API v2.0.0, Component v3.1.0)
> MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component v3.1.0)
> MCA pml: yalla (MCA v2.1.0, API v2.0.0, Component v3.1.0)
> MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v3.1.0)
> MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v3.1.0)
> MCA pml: ucx (MCA v2.1.0, API v2.0.0, Component v3.1.0)
>
> However, when I try to use them, mpirun gives back:
>
> --
> No components were able to be opened in the pml framework.
>
> This typically means that either no components of this type were
> installed, or none of the installed components can be loaded.
> Sometimes this means that shared libraries required by these
> components are unable to be found/loaded.
>
>   Host: cloudgpu6
>   Framework: pml
> --
>
> With strace I can see that the libraries
> /usr/mpi/gcc/openmpi-3.1.0rc2/lib64/openmpi/mca_pml_* are opened by mpirun,
> and ldd does not show any unresolved dependencies for them.
>
> How else could it be that pml is not found?
>
> Thanks,
> - Dmitry.

--
Jeff Squyres
jsquy...@cisco.com
[OMPI users] (no subject)
Dear all,

ompi_info reports that pml components are available:

$ /usr/mpi/gcc/openmpi-3.1.0rc2/bin/ompi_info -a | grep pml
MCA pml: v (MCA v2.1.0, API v2.0.0, Component v3.1.0)
MCA pml: monitoring (MCA v2.1.0, API v2.0.0, Component v3.1.0)
MCA pml: yalla (MCA v2.1.0, API v2.0.0, Component v3.1.0)
MCA pml: cm (MCA v2.1.0, API v2.0.0, Component v3.1.0)
MCA pml: ob1 (MCA v2.1.0, API v2.0.0, Component v3.1.0)
MCA pml: ucx (MCA v2.1.0, API v2.0.0, Component v3.1.0)

However, when I try to use them, mpirun gives back:

--
No components were able to be opened in the pml framework.

This typically means that either no components of this type were
installed, or none of the installed components can be loaded.
Sometimes this means that shared libraries required by these
components are unable to be found/loaded.

  Host: cloudgpu6
  Framework: pml
--

With strace I can see that the libraries /usr/mpi/gcc/openmpi-3.1.0rc2/lib64/openmpi/mca_pml_* are opened by mpirun, and ldd does not show any unresolved dependencies for them.

How else could it be that pml is not found?

Thanks,
- Dmitry.
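When a component shows up in ompi_info but refuses to open at run time, Open MPI can usually be made to print the underlying dlopen error. A hedged sketch of two diagnostics (the application name is a placeholder; both MCA parameters exist in the 3.x series):

mpirun --mca mca_base_component_show_load_errors 1 --mca pml_base_verbose 100 -np 2 ./app

The first parameter makes the component framework report why each mca_pml_* plugin failed to load (typically a missing symbol in a dependent library, which ldd on the plugin alone will not always reveal); the second traces the pml selection logic itself.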
[OMPI users] (no subject)
Hi all,

I encountered a problem when I tested the performance of Open MPI over 100 Gbps RoCE. I have two servers connected with Mellanox 100 Gbps ConnectX-4 RoCE NICs. I used the Intel MPI Benchmarks to test the performance of Open MPI (1.10.3) over RDMA. I found that the bandwidth of the PingPong benchmark (2 ranks, one rank per server) could reach only 6 GB/s (with the openib BTL). With the OSU MPI benchmarks, the bandwidth could reach only 6.5 GB/s. However, when I start two benchmarks at the same time (two ranks per server), the total bandwidth can reach about 11 GB/s.

It seems that the CPU is the bottleneck. Obviously, the bottleneck is not memcpy, and RDMA itself ought not to consume too many CPU resources, since the ib_write_bw perftest can reach 11 GB/s easily.

Is this bandwidth limit normal? Does anyone know what the real bottleneck is? Thanks for your kind help in advance.

Regards,
Zhaogeng
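For what it's worth, a typical way to pin such a measurement to the openib BTL in the 1.10 series looks like this (hostnames are placeholders):

mpirun -np 2 -host node1,node2 --mca btl openib,self,sm ./osu_bw

Single-stream RDMA bandwidth is also sensitive to where the rank is bound relative to the HCA's NUMA node, so comparing runs with --bind-to core and --report-bindings, placing the rank on the socket closest to the NIC, may be worth trying before concluding the CPU itself is the limit.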
Re: [OMPI users] (no subject)
: ...
--
2 total processes failed to start
[se01.grid.tuc.gr:19607] mca: base: close: component mmap closed
[se01.grid.tuc.gr:19607] mca: base: close: unloading component mmap

jb

-Original Message-
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of gil...@rist.or.jp
Sent: Monday, May 15, 2017 1:47 PM
To: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] (no subject)

Ioannis,

### What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

### Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

### Please describe the system on which you are running
* Operating system/version:
* Computer hardware:
* Network type:

also, what if you
mpirun --mca shmem_base_verbose 100 ...

Cheers,
Gilles

- Original Message -

Hi

I am trying to run the following simple demo on a cluster of two nodes

--
#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(NULL, NULL);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    MPI_Finalize();
}
--

i get always the message

--
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
--

any hint?

Ioannis Botsis
Re: [OMPI users] (no subject)
-Original Message-
From: users [mailto:users-boun...@lists.open-mpi.org] On Behalf Of gil...@rist.or.jp
Sent: Monday, May 15, 2017 1:47 PM
To: Open MPI Users <users@lists.open-mpi.org>
Subject: Re: [OMPI users] (no subject)

Ioannis,

### What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

### Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

### Please describe the system on which you are running
* Operating system/version:
* Computer hardware:
* Network type:

also, what if you
mpirun --mca shmem_base_verbose 100 ...

Cheers,
Gilles

- Original Message -
> Hi
>
> I am trying to run the following simple demo on a cluster of two nodes
>
> --
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char** argv) {
>     MPI_Init(NULL, NULL);
>
>     int world_size;
>     MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>
>     int world_rank;
>     MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
>
>     char processor_name[MPI_MAX_PROCESSOR_NAME];
>     int name_len;
>     MPI_Get_processor_name(processor_name, &name_len);
>
>     printf("Hello world from processor %s, rank %d out of %d processors\n",
>            processor_name, world_rank, world_size);
>
>     MPI_Finalize();
> }
> --
>
> i get always the message
>
> --
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   opal_shmem_base_select failed
>   --> Returned value -1 instead of OPAL_SUCCESS
> --
>
> any hint?
>
> Ioannis Botsis
Re: [OMPI users] (no subject)
Ioannis,

### What version of Open MPI are you using? (e.g., v1.10.3, v2.1.0, git branch name and hash, etc.)

### Describe how Open MPI was installed (e.g., from a source/distribution tarball, from a git clone, from an operating system distribution package, etc.)

### Please describe the system on which you are running
* Operating system/version:
* Computer hardware:
* Network type:

also, what if you
mpirun --mca shmem_base_verbose 100 ...

Cheers,
Gilles

- Original Message -
> Hi
>
> I am trying to run the following simple demo on a cluster of two nodes
>
> --
> #include <stdio.h>
> #include <mpi.h>
>
> int main(int argc, char** argv) {
>     MPI_Init(NULL, NULL);
>
>     int world_size;
>     MPI_Comm_size(MPI_COMM_WORLD, &world_size);
>
>     int world_rank;
>     MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);
>
>     char processor_name[MPI_MAX_PROCESSOR_NAME];
>     int name_len;
>     MPI_Get_processor_name(processor_name, &name_len);
>
>     printf("Hello world from processor %s, rank %d out of %d processors\n",
>            processor_name, world_rank, world_size);
>
>     MPI_Finalize();
> }
> --
>
> i get always the message
>
> --
> It looks like opal_init failed for some reason; your parallel process is
> likely to abort. There are many reasons that a parallel process can
> fail during opal_init; some of which are due to configuration or
> environment problems. This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
>   opal_shmem_base_select failed
>   --> Returned value -1 instead of OPAL_SUCCESS
> --
>
> any hint?
>
> Ioannis Botsis
[OMPI users] (no subject)
Hi

I am trying to run the following simple demo on a cluster of two nodes

--
#include <stdio.h>
#include <mpi.h>

int main(int argc, char** argv) {
    MPI_Init(NULL, NULL);

    int world_size;
    MPI_Comm_size(MPI_COMM_WORLD, &world_size);

    int world_rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &world_rank);

    char processor_name[MPI_MAX_PROCESSOR_NAME];
    int name_len;
    MPI_Get_processor_name(processor_name, &name_len);

    printf("Hello world from processor %s, rank %d out of %d processors\n",
           processor_name, world_rank, world_size);

    MPI_Finalize();
}
--

i get always the message

--
It looks like opal_init failed for some reason; your parallel process is
likely to abort. There are many reasons that a parallel process can
fail during opal_init; some of which are due to configuration or
environment problems. This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  opal_shmem_base_select failed
  --> Returned value -1 instead of OPAL_SUCCESS
--

any hint?

Ioannis Botsis
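Since opal_shmem_base_select fails before MPI_Init completes, the usual suspects are the temporary/session directory rather than the program itself. A hedged sketch of checks worth running on both nodes (the hostfile name is a placeholder):

mpirun --mca shmem_base_verbose 100 -np 2 -hostfile hosts ./a.out
df -h /tmp     # /tmp must not be full on either node
ls -ld /tmp    # and should be world-writable with the sticky bit (drwxrwxrwt)

Stale openmpi-sessions-* directories left under /tmp by previously crashed jobs are also worth removing before retrying.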
[OMPI users] (no subject)
Hi,

Does anyone have any idea about the following error? On that node, there are 15 empty cores.

Regards,
Mahmood
Re: [OMPI users] (no subject)
Hi,

it seems that your ompi was compiled with ofed ver X but running on ofed ver Y. X and Y are incompatible.

On Mon, Feb 22, 2016 at 8:18 PM, Mark Potter wrote:
> I am usually able to find the answer to my problems by searching the
> archive but I've run up against one that I can't suss out.
>
> bison-opt: relocation error:
> /home/pbme002/opt/gcc-4.8.2-tpls/openmpi-1.8.4/lib/libmpi.so.1: symbol
> rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1
> with link time reference
>
> There is the error I am getting; the problem is that it's not consistent.
> This happens to a random few jobs in a series of the same job on different
> data sets. The ones that fail and produce the error run fine when a second
> attempt is made. I am the admin for this cluster and the user is using
> their own compiled OpenMPI and not the system OpenMPI so I can't say for
> certain that it was compiled correctly, but it strikes me as odd that jobs
> would fail with the above error but run perfectly fine when a second
> attempt is made.
>
> I'm looking for any help sussing out what could be causing this issue.
>
> Regards,
>
> Mark L. Potter

--
Kind Regards,
M.
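One quick way to confirm an OFED/librdmacm mismatch is to check which librdmacm.so.1 each node resolves and whether it actually exports the versioned symbol. A hedged sketch (the libmpi path follows the error message above; the system library location is an assumption):

ldd /home/pbme002/opt/gcc-4.8.2-tpls/openmpi-1.8.4/lib/libmpi.so.1 | grep rdmacm
nm -D /usr/lib64/librdmacm.so.1 | grep rdma_get_src_port

If a subset of nodes resolves an older librdmacm that lacks rdma_get_src_port@RDMACM_1.0, that would explain why only jobs scheduled onto those nodes fail and why a second attempt (landing elsewhere) succeeds.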
[OMPI users] (no subject)
I am usually able to find the answer to my problems by searching the archive but I've run up against one that I can't suss out. bison-opt: relocation error: /home/pbme002/opt/gcc-4.8.2-tpls/openmpi-1.8.4/lib/libmpi.so.1: symbol rdma_get_src_port, version RDMACM_1.0 not defined in file librdmacm.so.1 with link time reference There is the error I am getting, the problem is that it's not consistent. This happens to a random few jobs in a series of the same job on different data sets. The ones that fail and produce the error run fine when a second attempt is made. I am the admin for this cluster and the user is using their own compiled OpenMPI and not the system OpenMPI so I can't say for certain that it was compiled correctly but it strikes me as odd that jobs would fail with the above error but run perfectly fine when a second attempt is made. I'm looking for any help sussing out what could be causing this issue. Regards, Mark L. Potter
[OMPI users] (no subject)
I had openmpi-1.6.5 installed and decided to install a newer version to get Java support in Open MPI, so I chose openmpi-1.8.2 and configured it as follows:

$ ./configure --enable-mpi-java --with-jdk-bindir=/usr/jdk7/bin --with-jdk-headers=/usr/jdk6/include --prefix=/usr/openmpi-1.8.2

This finishes with no errors, but when I install using "make all install" I get the error in the attached file.

error.docx
Description: MS-Word 2007 document
Re: [OMPI users] (no subject)
As I said, the degree of impact depends on the messaging pattern. If rank A typically sends/recvs with rank A+1, then you won't see much difference. However, if rank A typically sends/recvs with rank N-A, where N = #ranks in the job, then you'll see a very large difference.

You might try simply changing the mapping pattern - e.g., add -bynode to your cmd line. This would make it run faster if it followed the latter example.

On Nov 2, 2013, at 12:40 AM, San B wrote:

> Yes MM... But here a single node has 16 cores, not 64 cores.
>
> The first two jobs were with OMPI-1.4.5:
> 16 cores of single node - 3692.403
> 16 cores on two nodes (8 cores per node) - 12338.809
>
> The next two jobs were with OMPI-1.6.5:
> 16 cores of single node - 3547.879
> 16 cores on two nodes (8 cores per node) - 5527.320
>
> As others said, due to shared memory communication the single node job
> is running faster, but I was expecting a slight difference between 1 & 2
> nodes - which is taking 60% more time here.
>
> On Thu, Oct 31, 2013 at 8:19 PM, Ralph Castain wrote:
> Yes, though the degree of impact obviously depends on the messaging pattern
> of the app.
>
> On Oct 31, 2013, at 2:50 AM, MM wrote:
>> Of course, by this you mean, with the same total number of nodes, for e.g.
>> 64 processes on 1 node using shared mem, vs 64 processes spread over 2 nodes
>> (32 each for e.g.)?
>>
>> On 29 October 2013 14:37, Ralph Castain wrote:
>> As someone previously noted, apps will always run slower on multiple nodes
>> vs everything on a single node due to the shared memory vs IB differences.
>> Nothing you can do about that one.
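For concreteness, the mapping change suggested above is just a launch-line flag; in the 1.4/1.6 series it might look like this (the application name is a placeholder):

mpirun -np 16 -bynode ./app

The default -byslot mapping puts ranks 0-7 on the first node and 8-15 on the second; -bynode round-robins ranks across nodes instead, so which mapping wins depends on whether neighboring ranks (A and A+1) or distant ranks (A and N-A) exchange the most traffic.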
Re: [OMPI users] (no subject)
Yes MM... But here a single node has 16 cores, not 64 cores.

The first two jobs were with OMPI-1.4.5:
16 cores of single node - 3692.403
16 cores on two nodes (8 cores per node) - 12338.809

The next two jobs were with OMPI-1.6.5:
16 cores of single node - 3547.879
16 cores on two nodes (8 cores per node) - 5527.320

As others said, due to shared memory communication the single node job is running faster, but I was expecting a slight difference between 1 & 2 nodes - which is taking 60% more time here.

On Thu, Oct 31, 2013 at 8:19 PM, Ralph Castain wrote:

> Yes, though the degree of impact obviously depends on the messaging
> pattern of the app.
>
> On Oct 31, 2013, at 2:50 AM, MM wrote:
>
> Of course, by this you mean, with the same total number of nodes, for e.g.
> 64 processes on 1 node using shared mem, vs 64 processes spread over 2 nodes
> (32 each for e.g.)?
>
> On 29 October 2013 14:37, Ralph Castain wrote:
>
>> As someone previously noted, apps will always run slower on multiple
>> nodes vs everything on a single node due to the shared memory vs IB
>> differences. Nothing you can do about that one.
Re: [OMPI users] (no subject)
Yes, though the degree of impact obviously depends on the messaging pattern of the app.

On Oct 31, 2013, at 2:50 AM, MM wrote:

> Of course, by this you mean, with the same total number of nodes, for e.g. 64
> processes on 1 node using shared mem, vs 64 processes spread over 2 nodes (32
> each for e.g.)?
>
> On 29 October 2013 14:37, Ralph Castain wrote:
> As someone previously noted, apps will always run slower on multiple nodes vs
> everything on a single node due to the shared memory vs IB differences.
> Nothing you can do about that one.
Re: [OMPI users] (no subject)
Of course, by this you mean, with the same total number of nodes, for e.g. 64 processes on 1 node using shared mem, vs 64 processes spread over 2 nodes (32 each for e.g.)?

On 29 October 2013 14:37, Ralph Castain wrote:

> As someone previously noted, apps will always run slower on multiple nodes
> vs everything on a single node due to the shared memory vs IB differences.
> Nothing you can do about that one.
Re: [OMPI users] (no subject)
I don't think it's a bug in OMPI, but more likely reflects improvements in the default collective algorithms. If you want to further improve performance, you should bind your processes to a core (if your application isn't threaded) or to a socket (if threaded).

As someone previously noted, apps will always run slower on multiple nodes vs everything on a single node due to the shared memory vs IB differences. Nothing you can do about that one.

On Oct 28, 2013, at 10:36 PM, San B wrote:

> As discussed earlier, the executable which was compiled with OpenMPI-1.4.5
> gave very low performance of 12338.809 seconds when the job executed on two
> nodes (8 cores per node). The same job run on a single node (all 16 cores)
> got executed in just 3692.403 seconds. Now I compiled the application with
> OpenMPI-1.6.5 and it got executed in 5527.320 seconds on two nodes.
>
> Is this a performance gain with OMPI-1.6.5 over OMPI-1.4.5, or an issue
> with OpenMPI itself?
>
> On Tue, Oct 15, 2013 at 5:32 PM, San B wrote:
> Hi,
>
> As per your instruction, I did the profiling of the application with mpiP.
> Following is the difference between the two runs:
>
> Run 1: 16 mpi processes on single node
>
> @--- MPI Time (seconds) ---
>    Task    AppTime    MPITime    MPI%
>       0   3.61e+03        661   18.32
>       1   3.61e+03        627   17.37
>       2   3.61e+03        700   19.39
>       3   3.61e+03        665   18.41
>       4   3.61e+03        702   19.45
>       5   3.61e+03        703   19.48
>       6   3.61e+03        740   20.50
>       7   3.61e+03        763   21.14
> ...
>
> Run 2: 16 mpi processes on two nodes - 8 mpi processes per node
>
> @--- MPI Time (seconds) ---
>    Task    AppTime    MPITime    MPI%
>       0   1.27e+04   1.06e+04   84.14
>       1   1.27e+04   1.07e+04   84.34
>       2   1.27e+04   1.07e+04   84.20
>       3   1.27e+04   1.07e+04   84.20
>       4   1.27e+04   1.07e+04   84.22
>       5   1.27e+04   1.07e+04   84.25
>       6   1.27e+04   1.06e+04   84.02
>       7   1.27e+04   1.07e+04   84.35
>       8   1.27e+04   1.07e+04   84.29
>
> The time spent in MPI functions in run 1 is less than 20%, whereas it is
> more than 80% in run 2. For more details, I've attached both output files.
> Please go thru these files and suggest what optimization we can do with
> OpenMPI or Intel MKL.
>
> Thanks
>
> On Mon, Oct 7, 2013 at 12:15 PM, San B wrote:
> Hi,
>
> I'm facing a performance issue with a scientific application (Fortran). The
> issue is, it runs faster on a single node but runs very slow on multiple
> nodes. For example, a 16 core job on a single node finishes in 1hr 2mins,
> but the same job on two nodes (i.e. 8 cores per node & remaining 8 cores
> kept free) takes 3hr 20mins. The code is compiled with ifort-13.1.1,
> openmpi-1.4.5 and Intel MKL libraries - lapack, blas, scalapack, blacs &
> fftw. What could be the problem here?
>
> Is it possible to do any tuning in OpenMPI? FYI, more info: The cluster has
> Intel Sandybridge processors (E5-2670), InfiniBand, and Hyperthreading is
> enabled. Jobs are submitted thru the LSF scheduler.
>
> Does HyperThreading cause any problem here?
>
> Thanks
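For reference, in the 1.6 series the binding suggested above is done with mpirun flags; a hedged sketch (the application name is a placeholder):

mpirun -np 16 -bind-to-core ./app       # non-threaded app: one core per rank
mpirun -np 16 -bind-to-socket ./app     # threaded app: confine each rank to a socket

Adding --report-bindings prints where each rank actually landed; in Open MPI 1.8 and later the same thing is spelled --bind-to core / --bind-to socket.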
Re: [OMPI users] (no subject)
As discussed earlier, the executable which was compiled with OpenMPI-1.4.5 gave very low performance of 12338.809 seconds when the job executed on two nodes (8 cores per node). The same job run on a single node (all 16 cores) got executed in just 3692.403 seconds. Now I compiled the application with OpenMPI-1.6.5 and it got executed in 5527.320 seconds on two nodes.

Is this a performance gain with OMPI-1.6.5 over OMPI-1.4.5, or an issue with OpenMPI itself?

On Tue, Oct 15, 2013 at 5:32 PM, San B wrote:

> Hi,
>
> As per your instruction, I did the profiling of the application with mpiP.
> Following is the difference between the two runs:
>
> Run 1: 16 mpi processes on single node
>
> @--- MPI Time (seconds) ---
>    Task    AppTime    MPITime    MPI%
>       0   3.61e+03        661   18.32
>       1   3.61e+03        627   17.37
>       2   3.61e+03        700   19.39
>       3   3.61e+03        665   18.41
>       4   3.61e+03        702   19.45
>       5   3.61e+03        703   19.48
>       6   3.61e+03        740   20.50
>       7   3.61e+03        763   21.14
> ...
>
> Run 2: 16 mpi processes on two nodes - 8 mpi processes per node
>
> @--- MPI Time (seconds) ---
>    Task    AppTime    MPITime    MPI%
>       0   1.27e+04   1.06e+04   84.14
>       1   1.27e+04   1.07e+04   84.34
>       2   1.27e+04   1.07e+04   84.20
>       3   1.27e+04   1.07e+04   84.20
>       4   1.27e+04   1.07e+04   84.22
>       5   1.27e+04   1.07e+04   84.25
>       6   1.27e+04   1.06e+04   84.02
>       7   1.27e+04   1.07e+04   84.35
>       8   1.27e+04   1.07e+04   84.29
>
> The time spent in MPI functions in run 1 is less than 20%, whereas it is
> more than 80% in run 2. For more details, I've attached both output files.
> Please go thru these files and suggest what optimization we can do with
> OpenMPI or Intel MKL.
>
> Thanks
>
> On Mon, Oct 7, 2013 at 12:15 PM, San B wrote:
>
>> Hi,
>>
>> I'm facing a performance issue with a scientific application (Fortran).
>> The issue is, it runs faster on a single node but runs very slow on multiple
>> nodes. For example, a 16 core job on a single node finishes in 1hr 2mins,
>> but the same job on two nodes (i.e. 8 cores per node & remaining 8 cores
>> kept free) takes 3hr 20mins. The code is compiled with ifort-13.1.1,
>> openmpi-1.4.5 and Intel MKL libraries - lapack, blas, scalapack, blacs &
>> fftw. What could be the problem here?
>> Is it possible to do any tuning in OpenMPI? FYI, more info: The cluster has
>> Intel Sandybridge processors (E5-2670), InfiniBand, and Hyperthreading is
>> enabled. Jobs are submitted thru the LSF scheduler.
>>
>> Does HyperThreading cause any problem here?
>>
>> Thanks
Re: [OMPI users] (no subject)
Hi,

As per your instruction, I did the profiling of the application with mpiP. Following is the difference between the two runs:

Run 1: 16 mpi processes on single node

@--- MPI Time (seconds) ---
   Task    AppTime    MPITime    MPI%
      0   3.61e+03        661   18.32
      1   3.61e+03        627   17.37
      2   3.61e+03        700   19.39
      3   3.61e+03        665   18.41
      4   3.61e+03        702   19.45
      5   3.61e+03        703   19.48
      6   3.61e+03        740   20.50
      7   3.61e+03        763   21.14
...

Run 2: 16 mpi processes on two nodes - 8 mpi processes per node

@--- MPI Time (seconds) ---
   Task    AppTime    MPITime    MPI%
      0   1.27e+04   1.06e+04   84.14
      1   1.27e+04   1.07e+04   84.34
      2   1.27e+04   1.07e+04   84.20
      3   1.27e+04   1.07e+04   84.20
      4   1.27e+04   1.07e+04   84.22
      5   1.27e+04   1.07e+04   84.25
      6   1.27e+04   1.06e+04   84.02
      7   1.27e+04   1.07e+04   84.35
      8   1.27e+04   1.07e+04   84.29

The time spent in MPI functions in run 1 is less than 20%, whereas it is more than 80% in run 2. For more details, I've attached both output files. Please go thru these files and suggest what optimization we can do with OpenMPI or Intel MKL.

Thanks

On Mon, Oct 7, 2013 at 12:15 PM, San B wrote:

> Hi,
>
> I'm facing a performance issue with a scientific application (Fortran).
> The issue is, it runs faster on a single node but runs very slow on multiple
> nodes. For example, a 16 core job on a single node finishes in 1hr 2mins,
> but the same job on two nodes (i.e. 8 cores per node & remaining 8 cores
> kept free) takes 3hr 20mins. The code is compiled with ifort-13.1.1,
> openmpi-1.4.5 and Intel MKL libraries - lapack, blas, scalapack, blacs &
> fftw. What could be the problem here?
> Is it possible to do any tuning in OpenMPI? FYI, more info: The cluster has
> Intel Sandybridge processors (E5-2670), InfiniBand, and Hyperthreading is
> enabled. Jobs are submitted thru the LSF scheduler.
>
> Does HyperThreading cause any problem here?
>
> Thanks

mpi-App-profile-1node-16perNode.mpiP
Description: Binary data

mpi-App-profile-2Nodes-8perNode.mpiP
Description: Binary data
Re: [OMPI users] (no subject)
Hi,

When all processes run on the same node they communicate via shared memory, which delivers both high bandwidth and low latency. InfiniBand is slower and more latent than shared memory. Your parallel algorithm might simply be very latency sensitive, and you should profile it with something like mpiP or Vampir/VampirTrace in order to find out why, and only then try to further tune Open MPI.

Hope that helps,
Hristo

From: users [mailto:users-boun...@open-mpi.org] On Behalf Of San B
Sent: Monday, October 07, 2013 8:46 AM
To: OpenMPI ML
Subject: [OMPI users] (no subject)

Hi,

I'm facing a performance issue with a scientific application (Fortran). The issue is, it runs faster on a single node but runs very slow on multiple nodes. For example, a 16 core job on a single node finishes in 1hr 2mins, but the same job on two nodes (i.e. 8 cores per node & remaining 8 cores kept free) takes 3hr 20mins. The code is compiled with ifort-13.1.1, openmpi-1.4.5 and Intel MKL libraries - lapack, blas, scalapack, blacs & fftw. What could be the problem here?

Is it possible to do any tuning in OpenMPI? FYI, more info: The cluster has Intel Sandybridge processors (E5-2670), InfiniBand, and Hyperthreading is enabled. Jobs are submitted thru the LSF scheduler.

Does HyperThreading cause any problem here?

Thanks

--
Hristo Iliev, PhD
High Performance Computing Team
RWTH Aachen University, Center for Computing and Communication
Rechen- und Kommunikationszentrum der RWTH Aachen
Seffenter Weg 23, D 52074 Aachen (Germany)
Phone: +49 241 80 24367
Fax/UMS: +49 241 80 624367
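For what it's worth, mpiP needs no source changes; it is typically enabled at link time. A hedged sketch of how that commonly looks for a Fortran code (the mpiP install prefix and its support libraries vary by build and are assumptions here):

mpif90 app.f90 -o app -L$MPIP_HOME/lib -lmpiP -lbfd -liberty -lunwind -lm

Running the instrumented binary under mpirun then writes a *.mpiP report containing the per-rank AppTime/MPITime breakdown of the kind quoted later in this thread.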
Re: [OMPI users] (no subject)
Hi,

On 07.10.2013 at 08:45, San B wrote:

> I'm facing a performance issue with a scientific application (Fortran). The
> issue is, it runs faster on a single node but runs very slow on multiple
> nodes. For example, a 16 core job on a single node finishes in 1hr 2mins,
> but the same job on two nodes (i.e. 8 cores per node & remaining 8 cores
> kept free) takes 3hr 20mins. The code is compiled with ifort-13.1.1,
> openmpi-1.4.5 and Intel MKL libraries - lapack, blas, scalapack, blacs &
> fftw. What could be the problem here?

How do you provide a list of hosts it should use to the application? Maybe it's now just running on only one machine - and/or can make use only of local OpenMP inside MKL (yes, OpenMP here, which is bound to run on a single machine only).

-- Reuti

PS: Do you have 16 real cores or 8 plus Hyperthreading?

> Is it possible to do any tuning in OpenMPI? FYI, more info: The cluster has
> Intel Sandybridge processors (E5-2670), InfiniBand, and Hyperthreading is
> enabled. Jobs are submitted thru the LSF scheduler.
>
> Does HyperThreading cause any problem here?
>
> Thanks
[OMPI users] (no subject)
Hi,

I'm facing a performance issue with a scientific application (Fortran). The issue is, it runs faster on a single node but runs very slow on multiple nodes. For example, a 16 core job on a single node finishes in 1hr 2mins, but the same job on two nodes (i.e. 8 cores per node & remaining 8 cores kept free) takes 3hr 20mins. The code is compiled with ifort-13.1.1, openmpi-1.4.5 and Intel MKL libraries - lapack, blas, scalapack, blacs & fftw. What could be the problem here?

Is it possible to do any tuning in OpenMPI? FYI, more info: The cluster has Intel Sandybridge processors (E5-2670), InfiniBand, and Hyperthreading is enabled. Jobs are submitted thru the LSF scheduler.

Does HyperThreading cause any problem here?

Thanks
Re: [OMPI users] (no subject)
Pramoda,

That paper was exploring an application of a proposed extension to the MPI standard for fault tolerance purposes. By default this proposed interface is not provided by Open MPI. We have created a prototype version of Open MPI that includes this extension, and it can be found at the following website:

http://fault-tolerance.org/

You should look at the interfaces in the new proposal (ULFM Specification), since MPI_Comm_validate_rank is no longer part of the proposal. You can get the same functionality through some of the new interfaces that replace it. There are some examples on that website, and in the proposal, that should help you as well.

Best,
Josh

On Mon, Nov 19, 2012 at 8:59 AM, sri pramoda wrote:

> Dear Sir,
> I am Pramoda, a PG scholar from Jadavpur University, India.
> I've gone through the paper "Building a Fault Tolerant MPI Application:
> A Ring Communication Example". In this I found the MPI_Comm_validate_rank
> command, but I didn't find this command in MPI. Hence I request you to
> please send me the implementation of this command.
> Thank you,
> Pramoda.

--
Joshua Hursey
Assistant Professor of Computer Science
University of Wisconsin-La Crosse
http://cs.uwlax.edu/~jjhursey
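To give a flavor of the replacement interfaces: in the ULFM prototype, a process failure surfaces as an error class on a communication call, after which the survivors can repair the communicator. A minimal hedged sketch against the prototype's MPIX_* API (this is not stock Open MPI, and error handling is abbreviated):

#include <mpi.h>
#include <mpi-ext.h>   /* ULFM prototype extensions (MPIX_*) */
#include <stdio.h>

int main(int argc, char **argv) {
    MPI_Comm world = MPI_COMM_WORLD, shrunk;
    int rc, rank;

    MPI_Init(&argc, &argv);
    /* Return errors instead of aborting, so failures can be handled. */
    MPI_Comm_set_errhandler(world, MPI_ERRORS_RETURN);

    rc = MPI_Barrier(world);
    if (rc == MPIX_ERR_PROC_FAILED) {
        /* Acknowledge the failure, invalidate the old communicator,
           and build a new one containing only the survivors. */
        MPIX_Comm_failure_ack(world);
        MPIX_Comm_revoke(world);
        MPIX_Comm_shrink(world, &shrunk);
        world = shrunk;
    }

    MPI_Comm_rank(world, &rank);
    printf("rank %d still alive\n", rank);
    MPI_Finalize();
    return 0;
}

This roughly fills the role that the per-rank validation of MPI_Comm_validate_rank played in the paper's earlier prototype.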
[OMPI users] (no subject)
Dear Sir,

I am Pramoda, a PG scholar from Jadavpur University, India.
I've gone through the paper "Building a Fault Tolerant MPI Application: A Ring Communication Example". In this I found the MPI_Comm_validate_rank command, but I didn't find this command in MPI. Hence I request you to please send me the implementation of this command.

Thank you,
Pramoda.
[OMPI users] (no subject)
Hello!

I am having some problems. My environment: BLCR 0.8.4, Open MPI 1.5.5, OS Ubuntu 11.04.
I have 2 nodes: cuda05 (master, exporting an NFS file system) and cuda07 (slave, mounting the master).

I have also set in ~/.openmpi/mca-params.conf:

crs_base_snapshot_dir=/root/kidd_openMPI/Tmp
snapc_base_global_snapshot_dir=/root/kidd_openMPI/checkpoints

My configure line:

./configure --prefix=/root/kidd_openMPI --with-ft=cr --enable-ft-thread --with-blcr=/usr/local/BLCR --with-blcr-libdir=/usr/local/BLCR/lib --enable-mpirun-prefix-by-default --enable-static --enable-shared --enable-opal-multi-threads

Problem 1: ompi-restart on multiple nodes

command 01: mpirun -hostfile Hosts -am ft-enable-cr -x LD_LIBRARY_PATH -np 2 ./TEST
command 02: ompi-restart ompi_global_snapshot_2892.ckpt

-> I can checkpoint 2 processes on multiple nodes, but when restarting, it can only restart on the master node.

command 03: ompi-restart -hostfile Hosts ompi_global_snapshot_2892.ckpt

-> Error message below. I have made sure BLCR is OK.

--
root@cuda05:~/kidd_openMPI/checkpoints# ompi-restart -hostfile Hosts ompi_global_snapshot_2892.ckpt/
--
Error: BLCR was not able to restart the process because exec failed.
Check the installation of BLCR on all of the machines in your system.
The following information may be of help:
  Return Code : -1
  BLCR Restart Command : cr_restart
  Restart Command Line : cr_restart /root/kidd_openMPI/checkpoints/ompi_global_snapshot_2892.ckpt/0/opal_snapshot_1.ckpt/ompi_blcr_context.2704
--
--
Error: Unable to obtain the proper restart command to restart from the
checkpoint file (opal_snapshot_1.ckpt). Returned -1.
Check the installation of the blcr checkpoint/restart service on all of
the machines in your system.
--

Problem 2: ompi-migrate

I can't find it. How do I use ompi-migrate?
Re: [OMPI users] (no subject)
Harini,

you can install the OpenMPI which is packaged for your distribution of Linux; for example, on SuSE use

zypper install openmpi

or the equivalent on Redhat/Ubuntu. You probably will not get the most up to date OpenMPI version, but you will get the library paths set up in /etc/ld.so.conf.d/ and the MPI chooser installed. Once you have this version of OpenMPI working properly you should compile and install your own latest version.

I just checked - the latest version for SuSE 12.1 in the repository science/openSUSE is 1.4.5

On 16/03/2012, Gustavo Correa <g...@ldeo.columbia.edu> wrote:
>
> On Mar 16, 2012, at 8:51 AM, Addepalli, Srirangam V wrote:
>
>> This usually means your library path is not updated to find the MPI
>> libraries. You can fix this many ways; the basic two steps are
>>
>> 1. Identify the location of your libraries (use locate, find)
>> 2. Add it to your library path (export LD_LIBRARY_PATH, or make changes
>> in .bashrc or /etc/ld.so.conf)
>>
>> Rangam
>
> Hi Harini
>
> Rangam is right. Indeed there is even an FAQ specific for this:
>
> http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path
>
> By the way, the FAQ are the best documentation around. The README file is
> also helpful. Worth reading both, to avoid mistakes and waste of time.
>
> If using bash, on .profile or equivalent add these lines:
> export PATH=/my/path/to/openmpi/bin:$PATH
> export LD_LIBRARY_PATH=/my/path/to/openmpi/lib:$LD_LIBRARY_PATH
>
> If using [t]csh, on .[t]cshrc add these lines:
> setenv PATH /my/path/to/openmpi/bin:$PATH
> setenv LD_LIBRARY_PATH /my/path/to/openmpi/lib:$LD_LIBRARY_PATH
>
> with your actual path to openmpi replaced above, of course.
>
> I hope this helps,
> Gus Correa
>
>> From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of
>> jody [jody@gmail.com]
>> Sent: Friday, March 16, 2012 4:04 AM
>> To: Open MPI Users
>> Subject: Re: [OMPI users] (no subject)
>>
>> Hi
>>
>> Did you run your program with mpirun? For example:
>> mpirun -np 4 ./a.out
>>
>> jody
>>
>> On Fri, Mar 16, 2012 at 7:24 AM, harini.s .. <hharin...@gmail.com> wrote:
>>> Hi,
>>>
>>> I am very new to openMPI and I just installed openMPI 1.4.5 on a Linux
>>> platform. Now I am trying to run the examples in the folder that got
>>> downloaded. But when I run, I got this:
>>>
>>>>> a.out: error while loading shared libraries: libmpi.so.0: cannot open
>>>>> shared object file: No such file or directory
>>>
>>> I got a.out when I compiled hello_c.c using the mpicc command.
>>> Please help me to resolve this problem.
Re: [OMPI users] (no subject)
On Mar 16, 2012, at 8:51 AM, Addepalli, Srirangam V wrote:

> This usually means your library path is not updated to find the MPI
> libraries. You can fix this many ways; the basic two steps are
>
> 1. Identify the location of your libraries (use locate, find)
> 2. Add it to your library path (export LD_LIBRARY_PATH, or make changes in
> .bashrc or /etc/ld.so.conf)
>
> Rangam

Hi Harini

Rangam is right. Indeed there is even an FAQ specific for this:

http://www.open-mpi.org/faq/?category=running#adding-ompi-to-path

By the way, the FAQ are the best documentation around. The README file is also helpful. Worth reading both, to avoid mistakes and waste of time.

If using bash, on .profile or equivalent add these lines:
export PATH=/my/path/to/openmpi/bin:$PATH
export LD_LIBRARY_PATH=/my/path/to/openmpi/lib:$LD_LIBRARY_PATH

If using [t]csh, on .[t]cshrc add these lines:
setenv PATH /my/path/to/openmpi/bin:$PATH
setenv LD_LIBRARY_PATH /my/path/to/openmpi/lib:$LD_LIBRARY_PATH

with your actual path to openmpi replaced above, of course.

I hope this helps,
Gus Correa

> From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of
> jody [jody@gmail.com]
> Sent: Friday, March 16, 2012 4:04 AM
> To: Open MPI Users
> Subject: Re: [OMPI users] (no subject)
>
> Hi
>
> Did you run your program with mpirun? For example:
> mpirun -np 4 ./a.out
>
> jody
>
> On Fri, Mar 16, 2012 at 7:24 AM, harini.s .. <hharin...@gmail.com> wrote:
>> Hi,
>>
>> I am very new to openMPI and I just installed openMPI 1.4.5 on a Linux
>> platform. Now I am trying to run the examples in the folder that got
>> downloaded. But when I run, I got this:
>>
>>>> a.out: error while loading shared libraries: libmpi.so.0: cannot open
>>>> shared object file: No such file or directory
>>
>> I got a.out when I compiled hello_c.c using the mpicc command.
>> Please help me to resolve this problem.
Re: [OMPI users] (no subject)
This usually means your library path is not updated to find the MPI libraries. You can fix this many ways; the basic two steps are

1. Identify the location of your libraries (use locate, find)
2. Add it to your library path (export LD_LIBRARY_PATH, or make changes in .bashrc or /etc/ld.so.conf)

Rangam

From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of jody [jody@gmail.com]
Sent: Friday, March 16, 2012 4:04 AM
To: Open MPI Users
Subject: Re: [OMPI users] (no subject)

Hi

Did you run your program with mpirun? For example:
mpirun -np 4 ./a.out

jody

On Fri, Mar 16, 2012 at 7:24 AM, harini.s .. <hharin...@gmail.com> wrote:
> Hi,
>
> I am very new to openMPI and I just installed openMPI 1.4.5 on a Linux
> platform. Now I am trying to run the examples in the folder that got
> downloaded. But when I run, I got this:
>
>>> a.out: error while loading shared libraries: libmpi.so.0: cannot open
>>> shared object file: No such file or directory
>
> I got a.out when I compiled hello_c.c using the mpicc command.
> Please help me to resolve this problem.
Re: [OMPI users] (no subject)
Hi

Did you run your program with mpirun? For example:

mpirun -np 4 ./a.out

jody

On Fri, Mar 16, 2012 at 7:24 AM, harini.s .. wrote:
> Hi,
>
> I am very new to openMPI and I just installed openMPI 1.4.5 on a Linux
> platform. Now I am trying to run the examples in the folder that got
> downloaded. But when I run, I got this:
>
>>> a.out: error while loading shared libraries: libmpi.so.0: cannot open
>>> shared object file: No such file or directory
>
> I got a.out when I compiled hello_c.c using the mpicc command.
> Please help me to resolve this problem.
[OMPI users] (no subject)
Hi,

I am very new to openMPI and I just installed openMPI 1.4.5 on a Linux platform. Now I am trying to run the examples in the folder that got downloaded. But when I run, I got this:

>> a.out: error while loading shared libraries: libmpi.so.0: cannot open shared object file: No such file or directory

I got a.out when I compiled hello_c.c using the mpicc command. Please help me to resolve this problem.
Re: [OMPI users] (no subject)
This type of error message *usually* means that you haven't set your LD_LIBRARY_PATH to point to the intel library. Further, this *usually* means that you aren't sourcing the iccvars.sh file in your shell startup file on remote nodes (or iccvars.csh, depending on your shell). Remember that the LD_LIBRARY_PATH has to be set to include the location of the intel libraries on *all* nodes -- and since mpirun launches on remote nodes, you need to set this in your shell startup files (e.g., $HOME/.bashrc if you are using bash). On Feb 13, 2011, at 12:38 PM, lagoun brahim wrote: > hi every one > i need your help > i have a dual core machine with os linux opensuse 10.3 64bits > i configure openmpi with ifort and icc (icpc) > i compiled a wien2k code but when i run the parralel version of it i gut the > follow error message > /home/wien/lapw1_mpi: symbol lookup error: /usr/local/lib/libopen-pal.so.0: > undefined symbol: __intel_sse2_strcpy > /home/wien/lapw1_mpi: symbol lookup error: /usr/local/lib/libopen-pal.so.0: > undefined symbol: __intel_sse2_strcpy > cat: Pas de correspondance. > any suggestion > and thanks in advance > > > ___ > users mailing list > us...@open-mpi.org > http://www.open-mpi.org/mailman/listinfo.cgi/users -- Jeff Squyres jsquy...@cisco.com For corporate legal information go to: http://www.cisco.com/web/about/doing_business/legal/cri/
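For concreteness, a hedged sketch of what that looks like for bash (the Intel install prefix varies by version; /opt/intel is an assumption):

# in ~/.bashrc on every node, before any early exit for non-interactive shells
source /opt/intel/bin/iccvars.sh intel64

The .bashrc placement matters because mpirun launches remote ranks through non-interactive shells, which read .bashrc but not .bash_profile; a line that only runs for login shells will leave LD_LIBRARY_PATH unset exactly where the error occurs.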
[OMPI users] (no subject)
Hi everyone,

I need your help. I have a dual-core machine running Linux openSUSE 10.3 64-bit. I configured OpenMPI with ifort and icc (icpc). I compiled the WIEN2k code, but when I run the parallel version of it I get the following error message:

/home/wien/lapw1_mpi: symbol lookup error: /usr/local/lib/libopen-pal.so.0: undefined symbol: __intel_sse2_strcpy
/home/wien/lapw1_mpi: symbol lookup error: /usr/local/lib/libopen-pal.so.0: undefined symbol: __intel_sse2_strcpy
cat: Pas de correspondance.

Any suggestion? Thanks in advance.
Re: [OMPI users] (no subject)
On Friday 11 June 2010, asmae.elbahlo...@mpsa.com wrote:
> Hello
> i have a problem with paraFoam: when i type paraFoam in the terminal, it
> launches nothing, but in the terminal i have:

This is the OpenMPI mailing list, not OpenFoam. I suggest you contact the team behind OpenFoam. I also suggest that you post plain text to mailing lists in the future and not HTML (and while you're at it, do use a descriptive subject line).

/Peter

> tta201@linux-qv31:/media/OpenFoam/FOAMpro/FOAMpro-1.5-2.2/FOAM-1.5-2.2/tutorials/icoFoam/cavity> paraFoam
> Xlib: extension "GLX" missing on display ":0.0". [repeated 8 times]
> ERROR: In /home/kitware/Dashboard/MyTests/ParaView-3-8/ParaView-3.8/ParaView/VTK/Rendering/vtkXOpenGLRenderWindow.cxx, line 404
> vtkXOpenGLRenderWindow (0x117b3d0): Could not find a decent visual
> Xlib: extension "GLX" missing on display ":0.0". [repeated 8 times]
> ERROR: In /home/kitware/Dashboard/MyTests/ParaView-3-8/ParaView-3.8/ParaView/VTK/Rendering/vtkXOpenGLRenderWindow.cxx, line 404
> vtkXOpenGLRenderWindow (0x117b3d0): Could not find a decent visual
> Xlib: extension "GLX" missing on display ":0.0".
> ERROR: In /home/kitware/Dashboard/MyTests/ParaView-3-8/ParaView-3.8/ParaView/VTK/Rendering/vtkXOpenGLRenderWindow.cxx, line 611
> vtkXOpenGLRenderWindow (0x117b3d0): GLX not found. Aborting.
>
> /media/OpenFoam/FOAMpro/FOAMpro-1.5-2.2/FOAM-1.5-2.2/bin/paraFoam: line 81: 15497 Aborted paraview --data=$caseFile
>
> I don't understand the problem, can someone help me please?
> thanks

--
Peter Kjellström | E-mail: c...@nsc.liu.se
National Supercomputer Centre | Sweden | http://www.nsc.liu.se
Re: [OMPI users] (no subject)
The functionality of the checkpoint operation is not tied to CPU utilization.

Are you running with the C/R thread enabled? If not, then the checkpoint might be waiting until the process enters the MPI library. Does the system emit an error message describing the error that it encountered?

The C/R support does require that all processes be between MPI_INIT and MPI_FINALIZE. It is difficult to guarantee that the job is between these two functions globally (there are race conditions to worry about). This might be causing the problem as well, since if some of the processes have not passed through MPI_INIT then some of the support services might not be properly initialized.

Let me know what you find, and we can start looking at what might be causing this problem.

-- Josh

On May 11, 2010, at 5:35 PM, wrote:

> Hi
>
> I am using open-mpi 1.3.4 with BLCR. Sometimes I am running into a strange
> problem with the ompi-checkpoint command. Even though I see that all MPI
> processes (equal to the np argument) are running, the ompi-checkpoint
> command fails at times. I have seen this failure always when the MPI
> processes spawned are not fully running, i.e., these processes are not
> running above 90% CPU utilization. How do I ensure that the MPI processes
> are fully up and running before I issue ompi-checkpoint, since dynamically
> detecting whether the processes are utilizing above 90% CPU resources is
> not easy? Are there any MCA parameters I can use to overcome this issue?
>
> Thanks
> Ananda
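For reference, a hedged sketch of enabling the C/R thread, assuming a build configured for checkpoint/restart like the ones shown elsewhere in this thread (the application name is a placeholder):

./configure --with-ft=cr --enable-ft-thread ...
mpirun -np 4 -am ft-enable-cr -mca opal_cr_use_thread 1 ./app

With the thread enabled, checkpoint requests can be serviced even while ranks sit in long compute phases outside the MPI library, which matches the symptom of ompi-checkpoint stalling until the processes enter MPI calls.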
[OMPI users] (no subject)
Ralph,

When you say manually, do you mean setting these parameters in the command line while calling mpirun, ompi-restart, and ompi-checkpoint? Or is there another way to set these parameters?

Thanks
Ananda

==
Subject: Re: [OMPI users] opal_cr_tmp_dir
From: Ralph Castain

You shouldn't have to, but there may be a bug in the system. Try manually setting both envars and see if it fixes the problem.

On May 12, 2010, at 3:59 PM, <ananda.mudar_at_[hidden]> wrote:

> Ralph
>
> I have these parameters set in the ~/.openmpi/mca-params.conf file:
>
> $ cat ~/.openmpi/mca-params.conf
> orte_tmpdir_base = /home/ananda/ORTE
> opal_cr_tmp_dir = /home/ananda/OPAL
> $
>
> Should I be setting OMPI_MCA_opal_cr_tmp_dir?
>
> FYI, I am using openmpi 1.3.4 with blcr 0.8.2
>
> Thanks
> Ananda
>
> =
>
> Subject: Re: [OMPI users] opal_cr_tmp_dir
> From: Ralph Castain
>
> ompi-restart just does a fork/exec of the mpirun, so it should get the param
> if it is in your environ. How are you setting it? Have you tried adding
> OMPI_MCA_opal_cr_tmp_dir= to your environment?
>
> On May 12, 2010, at 12:45 PM, <ananda.mudar_at_[hidden]> wrote:
>
>> Thanks Ralph.
>>
>> Another question. Even though I am setting opal_cr_tmp_dir to a directory
>> other than /tmp while calling the ompi-restart command, this setting is
>> not getting passed to the mpirun command that gets generated by
>> ompi-restart. How do I overcome this constraint?
>>
>> ==
>>
>> Subject: Re: [OMPI users] opal_cr_tmp_dir
>> From: Ralph Castain
>>
>> It's a different MCA param: orte_tmpdir_base
>>
>> On May 12, 2010, at 12:33 PM, <ananda.mudar_at_[hidden]> wrote:
>>
>>> I am setting the MCA parameter "opal_cr_tmp_dir" to a directory other
>>> than /tmp while calling the "mpirun", "ompi-restart", and
>>> "ompi-checkpoint" commands so that I don't fill up the /tmp filesystem.
>>> But I see that the openmpi-sessions* directory is still getting created
>>> under /tmp. How do I overcome this problem so that the openmpi-sessions*
>>> directory also gets created under the same directory I have defined for
>>> "opal_cr_tmp_dir"?
>>>
>>> Is there a way to clean up these temporary files after their requirement
>>> is over?
>>>
>>> Thanks
>>> Ananda
Re: [OMPI users] (no subject)
Jeff Squyres wrote:
> On Feb 21, 2010, at 10:25 AM, Rodolfo Chua wrote:
>> I used openMPI compiled with the GNU (gcc) compiler to run the GULP code
>> in parallel. But when I try to input "mpirun -np 2 gulp", GULP did not run
>> on two processors. Can you give me any suggestion on how to compile the
>> GULP code exactly with openMPI.
>
> I'm afraid that I don't know the GULP code in particular, but their advice
> is sound: adding -DMPI sounds like something specific to their code (e.g.,
> to activate the MPI code sections). But using mpif77 / mpif90 as your
> compiler name in their build process is probably the Right thing to do
> (e.g., instead of ifort / gfortran / pgf77 / whatever). This should build
> their executable with Open MPI's support libraries linked in, etc.

What Jeff said sounds right (as usual). But, I'm intrigued about one point. Even if one did not compile for MPI, if you launch with "mpirun -np 2 gulp", I would think you would still see two processes. They would not be two processes of the same MPI job, but two replicas of the same serial job. So, I'm curious what Rodolfo's second sentence ("But when I try ...") means.
Re: [OMPI users] (no subject)
On Feb 21, 2010, at 10:25 AM, Rodolfo Chua wrote:

> I used openMPI compiled with the GNU (gcc) compiler to run the GULP code in
> parallel. But when I try to input "mpirun -np 2 gulp", GULP did not run on
> two processors. Can you give me any suggestion on how to compile the GULP
> code exactly with openMPI.
>
> Below is the instruction from the GULP code manual.
> "If you wish to run the program in parallel using MPI then you will need to
> alter the file "getmachine" accordingly. The usual changes would be to add
> the "-DMPI" option and in some cases change the compiler name (for example
> to mpif77/mpif90) or include the MPI libraries in the link stage."

I'm afraid that I don't know the GULP code in particular, but their advice is sound: adding -DMPI sounds like something specific to their code (e.g., to activate the MPI code sections). But using mpif77 / mpif90 as your compiler name in their build process is probably the Right thing to do (e.g., instead of ifort / gfortran / pgf77 / whatever). This should build their executable with Open MPI's support libraries linked in, etc.

--
Jeff Squyres
jsquy...@cisco.com
[OMPI users] (no subject)
Hi!

I used openMPI compiled with the GNU (gcc) compiler to run the GULP code in parallel. But when I try to input "mpirun -np 2 gulp", GULP did not run on two processors. Can you give me any suggestion on how to compile the GULP code exactly with openMPI.

Below is the instruction from the GULP code manual:
"If you wish to run the program in parallel using MPI then you will need to alter the file "getmachine" accordingly. The usual changes would be to add the "-DMPI" option and in some cases change the compiler name (for example to mpif77/mpif90) or include the MPI libraries in the link stage."
Re: [OMPI users] (no subject)
Hi Konstantinos, list

If you want "qsub" you need to install the resource manager / queue system on your PC. Assuming your PC is a Linux box, if your resource manager is Torque/PBS, on some Linux distributions it can be installed from an rpm through yum (or an equivalent mechanism), for instance. I am not sure, but I would guess SGE and SLURM may also be available through rpms. Or you can install the resource manager from source. We have workstations/PCs here running Torque (installed through yum and rpm), for the convenience of submitting jobs as in a cluster, and letting the queue control them.

You could also use just "mpiexec" directly. This doesn't require a resource manager, but you have to be the resource manager yourself, baby-sitting the jobs, submitting one at a time, waiting for completion, etc.

On another related issue, let's say your 2 processors are dual core, for a total of 4 cores. Then you can count on submitting "mpiexec" with a number of processes up to 4 ("-n 4" or "-np 4"). If you use more than 4, say "-np 6", you are oversubscribing the physical cores. Linux will have to make the 6 processes take turns using the 4 cores. (Some resource managers won't let you do this.) Oversubscription can work for lightweight MPI jobs, but in my experience it eventually hangs for heavier computation/communication codes.

Also, note that any interactive work that you may be doing on your PC, concurrently with the MPI jobs, will have an impact on performance, and may even take the MPI jobs to a halt. We had this experience here, when the user of the aforementioned workstation insisted on running Matlab, browsing the web, watching streaming video, and listening to music while the MPI jobs were running. :)

I hope this helps.
Gus Correa

-
Gustavo Correa
Lamont-Doherty Earth Observatory - Columbia University
Palisades, NY, 10964-8000 - USA
-

Konstantinos Angelopoulos wrote:

> good part of the day,
>
> I am trying to run a parallel program (that used to run on a cluster) on my
> dual-core PC. Could openmpi simulate the distribution of the parallel jobs
> to my 2 processors, meaning will qsub work even if it is not a real cluster?
>
> thank you for reading my message and for any answer.
>
> Konstantinos Angelopoulos
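To make the Torque route concrete, a hedged sketch of a minimal submission script for a dual-core PC (job name, script name, and binary are placeholders):

#!/bin/bash
#PBS -N mpitest
#PBS -l nodes=1:ppn=2
cd $PBS_O_WORKDIR
mpiexec -n 2 ./a.out

Submitted with "qsub script.pbs", this requests 2 slots on the single node; if Open MPI was built with Torque support, mpiexec reads the allocation (the $PBS_NODEFILE) automatically, so no hostfile is needed.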
Re: [OMPI users] (no subject)
On Friday 30 October 2009, Konstantinos Angelopoulos wrote:
> good part of the day,
> I am trying to run a parallel program (that used to run in a cluster) in my double core pc. Could openmpi simulate the distribution of the parallel jobs to my 2 processors

If your program is an MPI program then yes, Open MPI on your PC would allow you to use both cores (assuming your job can fit on the PC, of course).

> meaning will qsub work even if it is not a real cluster?

qsub has nothing to do with MPI; it belongs to the workload management (batch queue) system. You could install this on your PC as well (see for example Torque, SGE or SLURM).

/Peter

> thank you for reading my message and for any answer.
> Konstantinos Angelopoulos
[OMPI users] (no subject)
good part of the day,

I am trying to run a parallel program (that used to run in a cluster) on my dual-core PC. Could openmpi simulate the distribution of the parallel jobs to my 2 processors? In other words, will qsub work even if it is not a real cluster? Thank you for reading my message and for any answer.

Konstantinos Angelopoulos Post-Graduate Student Brunel University School of Engineering and Design Uxbridge, Middlesex UB8 3PH UK Contact emails: mepgk...@brunel.ac.uk
[OMPI users] (no subject)
Hi All,

I compiled Open MPI on Windows Server 2003, both through Cygwin and through CMake with Visual Studio. Both methods compiled successfully. In Cygwin I configured with the following command:

./configure --enable-mca-no-build=timer-windows,memory_mallopt,maffinity,paffinity

Without these flags I was getting errors. I get the same error while running mpirun.exe/orterun.exe. Can anyone help me rectify these errors?

C:\openmpi_sln\debug>orterun.exe -np 2 ipconfig
[8puq2akbo:07476] mca: base: component_find: "mca_paffinity_windows" does not appear to be a valid paffinity MCA dynamic component (ignored): The specified module could not be found.
--
It looks like opal_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during opal_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
opal_paffinity_base_select failed
--> Returned value -13 instead of OPAL_SUCCESS
--
[8puq2akbo:07476] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ..\..\Linpack\Source\orte\runtime\orte_init.c at line 79
[8puq2akbo:07476] [[INVALID],INVALID] ORTE_ERROR_LOG: Not found in file ..\..\..\..\Linpack\Source\orte\tools\orterun\orterun.c at line 570

Thanks, Basant
[OMPI users] (no subject)
dear sir

I am sending the details as follows:

1. I am using openmpi-1.3.3 and blcr 0.8.2.
2. I installed blcr 0.8.2 first, under /root/MS.
3. Then I installed openmpi-1.3.3 under /root/MS.
4. I configured and installed Open MPI as follows:

# ./configure --with-ft=cr --enable-mpi-threads --with-blcr=/usr/local/bin --with-blcr-libdir=/usr/local/lib
# make
# make install

Then I added the following to the .bash_profile under the home directory (I went to the home directory by doing cd ~):

/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
/sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko
PATH=$PATH:/usr/local/bin
MANPATH=$MANPATH:/usr/local/man
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib

Then I compiled and ran the file arr_add.c as follows:

[root@localhost examples]# mpicc -o res arr_add.c
[root@localhost examples]# mpirun -np 2 -am ft-enable-cr ./res
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
--
Error: The process with PID 5790 is not checkpointable. This could be due to one of the following:
- An application with this PID doesn't currently exist
- The application with this PID isn't checkpointable
- The application with this PID isn't an OPAL application.
We were looking for the named files:
/tmp/opal_cr_prog_write.5790
/tmp/opal_cr_prog_read.5790
--
[localhost.localdomain:05788] local) Error: Unable to initiate the handshake with peer [[7788,1],1]. -1
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 567
[localhost.localdomain:05788] [[7788,0],0] ORTE_ERROR_LOG: Error in file snapc_full_global.c at line 1054
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

NOTE: the PID of mpirun is 5788. I gave the following command for taking the checkpoint:

[root@localhost examples]# ompi-checkpoint -s 5788

I got the following output, but it was hanging like this:

[localhost.localdomain:05796] Requested - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Pending - Global Snapshot Reference: (null)
[localhost.localdomain:05796] Running - Global Snapshot Reference: (null)

Kindly rectify it.

with regards
mallikarjuna shastry
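One thing worth noting in the setup above (whether it is the actual cause here is not confirmed): the .bash_profile lines assign PATH, MANPATH, and LD_LIBRARY_PATH but never export them, so child processes such as mpirun and the BLCR utilities may not inherit them. A minimal sketch of the same settings with exports:

    # ~/.bash_profile -- load the BLCR modules and export the paths.
    # Module paths assume the BLCR 0.8.2 install described above.
    /sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr_imports.ko
    /sbin/insmod /usr/local/lib/blcr/2.6.23.1-42.fc8/blcr.ko
    export PATH=$PATH:/usr/local/bin
    export MANPATH=$MANPATH:/usr/local/man
    export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/usr/local/lib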
[OMPI users] (no subject)
Hi,

I found that a subroutine call inside a loop does not return the correct value after a certain number of iterations. In order to simplify the problem, the inputs to the subroutine are chosen to be constant, so the output should be the same for every iteration on every computing node. It is a Fortran program; after the initialization the program goes like this:

    do i = 1, N
       call my_sub(A, B, C, re)
       print *, mypn, A, B, C, re
    end do

where re is the output value of my_sub and A, B, C are inputs to my_sub. 570 is the number of correct iterations: if the combined number of instances does not exceed 570, the output is fine. For example, if I requested 10 computing nodes and N were 40, giving 10*40 = 400 instances, the output would be fine. But if the combined instances exceed 570, the first 570 are fine and the rest return NaN values. For example, if the number of computing nodes were 20 and N were 40, which gives 20*40 = 800 instances, then the first 570 are fine but the rest are NaN. Does someone know what might cause the problem? I googled it, but can't find a clue where to start. Please also let me know what else you need to debug the problem. Thanks.

Julia
Re: [OMPI users] (no subject)
The MPI standard does not define any functions for taking checkpoints from the application. The checkpoint/restart work in Open MPI is a command-line-driven, transparent solution: the application does not have to change in any way, and the user (or scheduler) must initiate the checkpoint from the command line (on the same node as the mpirun process).

We have experimented with adding Open MPI-specific checkpoint/restart interfaces in the context of the MPI Forum. These prototypes have not made it to the Open MPI trunk. Some information about that particular development is at the link below:
https://svn.mpi-forum.org/trac/mpi-forum-web/wiki/Quiescence

Best, Josh

On Jul 6, 2009, at 12:07 AM, Mallikarjuna Shastry wrote:
dear sir/madam, what are the MPI functions used for taking a checkpoint and restarting within the application in MPI programs, and where do I get these functions from? with regards mallikarjuna shastry
___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
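As a concrete illustration of that command-line-driven workflow, here is a sketch, assuming a C/R-enabled build like the ones discussed elsewhere in this digest (the application name and PID are made up):

    # Start the job with the checkpoint/restart AMCA profile:
    mpirun -np 4 -am ft-enable-cr ./my_app

    # From the node running mpirun, checkpoint the job by mpirun's
    # PID (12345 here):
    ompi-checkpoint 12345

    # Later, restart from the global snapshot reference that
    # ompi-checkpoint printed; the name below only illustrates the
    # usual naming scheme:
    ompi-restart ompi_global_snapshot_12345.ckpt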
[OMPI users] (no subject)
dear sir/madam

What are the MPI functions used for taking a checkpoint and restarting within the application in MPI programs, and where do I get these functions from?

with regards
mallikarjuna shastry
Re: [OMPI users] (no subject)
Hi,

Sorry, my mistake. Attached is the config.log file.

> make install
> no rule to make target 'VERSION', needed by Makefile.in STOP
> ompi_info --all
> ompi_info: command not found

Thanks, Cami

-Original Message-
From: users-boun...@open-mpi.org [mailto:users-boun...@open-mpi.org] On Behalf Of Jeff Squyres
Sent: Thursday, May 14, 2009 3:02 PM
To: Open MPI Users
Subject: Re: [OMPI users] (no subject)

Please send all the information listed here: http://www.open-mpi.org/community/help/

On May 14, 2009, at 1:20 AM, Camelia Avram wrote:
> Hi,
> I'm new to MPI. I'm trying to install OpenMPI and I got some errors. I use the command ./configure --prefix=/usr/local, with no problem.
> But after that, "make all install" gives the message: "no rule to make target 'VERSION', needed by Makefile.in STOP"
> What should I do?
> Thanks,
> Cami
> ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Jeff Squyres Cisco Systems
___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

[attachment: config.log.tar.gz]
Re: [OMPI users] (no subject)
Please send all the information listed here: http://www.open-mpi.org/community/help/

On May 14, 2009, at 1:20 AM, Camelia Avram wrote:
Hi, I'm new to MPI. I'm trying to install OpenMPI and I got some errors. I use the command ./configure --prefix=/usr/local, with no problem. But after that, "make all install" gives the message: "no rule to make target 'VERSION', needed by Makefile.in STOP". What should I do? Thanks, Cami
___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Jeff Squyres Cisco Systems
[OMPI users] (no subject)
Hi,
I'm new to MPI. I'm trying to install OpenMPI and I got some errors. I use the command ./configure --prefix=/usr/local, with no problem. But after that, "make all install" gives the message: "no rule to make target 'VERSION', needed by Makefile.in STOP". What should I do?
Thanks, Cami
[OMPI users] (no subject)
Hello! How can I integrate my own collective communication algorithm into Open MPI with MCA?
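For orientation: collective algorithms live in Open MPI's "coll" MCA framework, implemented under ompi/mca/coll/<name>/ in the source tree, usually by copying the layout of an existing component such as "basic". The standard commands below show how coll components are listed and selected at run time; the component name "mycoll" is hypothetical:

    # List the collective components compiled into this installation:
    ompi_info | grep coll

    # Show the MCA parameters the coll framework understands:
    ompi_info --param coll all

    # Once a component named "mycoll" is built and installed, request
    # it explicitly (with basic/self available as fallbacks):
    mpirun --mca coll mycoll,basic,self -np 4 ./app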
Re: [OMPI users] (no subject)
On May 27, 2008, at 9:33 AM, Gabriele Fatigati wrote:
> Great, it works! Thank you very very much. But can you explain to me how this parameter works?

You might want to have a look at this short video for a little background on some relevant OpenFabrics concepts:
http://www.open-mpi.org/video/?category=openfabrics#openfabrics-concepts

In v1.2, for short messages, OMPI will sometimes copy your message to a pre-posted receive buffer and immediately mark the MPI request as "complete". Depending on the timing and current network resource usage, the message may or may not have been given to the network stack yet (e.g., if we're out of flow control credits to send to this particular peer). If your application keeps dipping down into the MPI layer frequently, this situation will almost certainly resolve itself once the receiver becomes active or other events occur to free up available resources. As such, the early completion optimization pretty much depends on frequent calls to MPI. Without them, since OMPI currently has no independent progression (e.g., a progress thread), your message will wait until OMPI's internal progress engine is tripped again.

Hope that helps.

On Thu, 15 May 2008 21:40:45 -0400, Jeff Squyres said:
Sorry, this message escaped for so long that it got buried in my INBOX. The problem you're seeing might be related to one we just answered about a similar situation:
http://www.open-mpi.org/community/lists/users/2008/05/5657.php
See if using the pml_ob1_use_early_completion flag works for you.

On Apr 30, 2008, at 7:05 AM, Gabriele FATIGATI wrote:
Hi, I tried to run the SkaMPI benchmark on an IBM-BladeCenterLS21-BCX system with 256 processors, but the test stopped in the "AlltoAll-length" routine, with count=8192, for some reason. I launched the test with:
--mca btl_openib_eager_limit 1024
The same tests with 128 processors or fewer finished successfully. Different values of the eager limit don't solve the problem. Thanks in advance.
-- Gabriele Fatigati CINECA Systems & Technologies Department Supercomputing Group Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy www.cineca.it Tel: +39 051 6171722 g.fatig...@cineca.it
___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Jeff Squyres Cisco Systems

-- Gabriele Fatigati CINECA Systems & Technologies Department Supercomputing Group Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy www.cineca.it Tel: +39 051 6171722 g.fatig...@cineca.it
___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Jeff Squyres Cisco Systems
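For reference, the flag discussed here is an MCA parameter of the ob1 PML, so it can be set on the mpirun command line. A sketch, assuming an Open MPI 1.2.x build where this parameter exists (setting it to 0 disables the early-completion optimization):

    mpirun --mca pml_ob1_use_early_completion 0 -np 256 ./skampi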
Re: [OMPI users] (no subject)
Great, it works! Thank you very very much. But can you explain to me how this parameter works?

On Thu, 15 May 2008 21:40:45 -0400, Jeff Squyres said:
> Sorry, this message escaped for so long that it got buried in my INBOX. The problem you're seeing might be related to one we just answered about a similar situation:
> http://www.open-mpi.org/community/lists/users/2008/05/5657.php
> See if using the pml_ob1_use_early_completion flag works for you.
>
> On Apr 30, 2008, at 7:05 AM, Gabriele FATIGATI wrote:
> > Hi, I tried to run the SkaMPI benchmark on an IBM-BladeCenterLS21-BCX system with 256 processors, but the test stopped in the "AlltoAll-length" routine, with count=8192, for some reason.
> > I launched the test with:
> > --mca btl_openib_eager_limit 1024
> > The same tests with 128 processors or fewer finished successfully.
> > Different values of the eager limit don't solve the problem. Thanks in advance.
> > -- Gabriele Fatigati CINECA Systems & Technologies Department Supercomputing Group Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy www.cineca.it Tel: +39 051 6171722 g.fatig...@cineca.it
> > ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
>
> -- Jeff Squyres Cisco Systems

-- Gabriele Fatigati CINECA Systems & Technologies Department Supercomputing Group Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy www.cineca.it Tel: +39 051 6171722 g.fatig...@cineca.it
Re: [OMPI users] (no subject)
Sorry, this message escaped for so long that it got buried in my INBOX. The problem you're seeing might be related to one we just answered about a similar situation:
http://www.open-mpi.org/community/lists/users/2008/05/5657.php
See if using the pml_ob1_use_early_completion flag works for you.

On Apr 30, 2008, at 7:05 AM, Gabriele FATIGATI wrote:
Hi, I tried to run the SkaMPI benchmark on an IBM-BladeCenterLS21-BCX system with 256 processors, but the test stopped in the "AlltoAll-length" routine, with count=8192, for some reason. I launched the test with:
--mca btl_openib_eager_limit 1024
The same tests with 128 processors or fewer finished successfully. Different values of the eager limit don't solve the problem. Thanks in advance.
-- Gabriele Fatigati CINECA Systems & Technologies Department Supercomputing Group Via Magnanelli 6/3, Casalecchio di Reno (BO) Italy www.cineca.it Tel: +39 051 6171722 g.fatig...@cineca.it
___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Jeff Squyres Cisco Systems
Re: [OMPI users] (no subject)
I can think of several advantages that using blocking or signals to reduce the CPU load would have:

- Reduced energy consumption
- Running additional background programs could be done far more efficiently
- It would be much simpler to examine the load balance

It may depend on the type of program and the computational environment, but there are certainly many cases in which putting the system in idle mode would be advantageous. This is especially true for programs with low network traffic and/or high load imbalances.

The "spin for a while and then block" method that you mentioned earlier seems to be a good compromise: poll for some time that is long compared to the corresponding system call, and then go to sleep if nothing happens. In this way the latency would be only marginally increased, while less CPU time is wasted in the polling loops, and I would be much happier.

Jeff Squyres schrieb:
> On Apr 23, 2008, at 3:49 PM, Danesh Daroui wrote:
>> Do you really mean that Open-MPI uses a busy loop in order to handle incoming calls? It seems to be incorrect, since spinning is a very bad and inefficient technique for this purpose.
>
> It depends on what you're optimizing for. :-) We're optimizing for minimum message passing latency on hosts that are not oversubscribed; polling is very good at that. Polling is much better than blocking, particularly if the blocking involves a system call (which will be "slow"). Note that in a compute-heavy environment, the nodes are going to be running at 100% CPU anyway.
>
> Also keep in mind that you're only going to have "waste" spinning in MPI if you have a loosely/poorly synchronized application. Granted, some applications are this way by nature, but we have not chosen to optimize spare CPU cycles for them. As I said in a prior mail, adding a blocking strategy is on the to-do list, but it's fairly low in priority right now. Someone may care to improve the message passing engine to include blocking, but it hasn't happened yet. Want to work on it? :-)
>
> And for reference: almost all MPIs do busy polling to minimize latency. Some of them will shift to blocking if nothing happens for a "long" time. This second piece is what OMPI is lacking.
>
>> Why don't you use blocking and/or signals instead of that? I think the priority of this task is very high, because polling just wastes resources of the system.
>
> FWIW: I mentioned this in my other mail -- latency is quite definitely negatively impacted when you use such mechanisms. Blocking and signals are "slow" (in comparison to polling).
>
> In production HPC environments, the entire resource is dedicated to the MPI app anyway, so there's nothing else that really needs it. So we allow them to busy-spin.
>
> There is a mode to call yield() in the middle of every OMPI progress loop, but it's only helpful for loosely/poorly synchronized MPI apps and ones that use TCP or shared memory. Low latency networks such as IB or Myrinet won't be as friendly to this setting because they're busy polling (i.e., they call yield() much less frequently, if at all).
>
>> On the other hand, what Alberto claims is not reasonable to me.
>>
>> Alberto,
>> - Are you oversubscribing one node, which means that you are running your code on a single-processor machine, pretending to have four CPUs?
>> - Did you compile Open-MPI or install it from an RPM?
>>
>> Receiving a message shouldn't be that expensive.
>>
>> Regards,
>> Danesh

[snip: remainder of quoted thread; the same messages appear in full below]
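The "spin for a while and then block" compromise discussed above is straightforward to sketch in plain C. This is not Open MPI code, just an illustration of the general technique applied to a file descriptor with poll(2); the spin count is an arbitrary choice:

    #include <poll.h>
    #include <stdbool.h>

    /* Wait for data on fd: busy-poll for a bounded number of
     * iterations (low latency if data arrives quickly), then fall
     * back to a blocking poll() so no CPU is burned on long waits. */
    static bool wait_for_data(int fd, long spin_iters)
    {
        struct pollfd p = { .fd = fd, .events = POLLIN };

        /* Phase 1: spin with a zero timeout. */
        for (long i = 0; i < spin_iters; ++i) {
            if (poll(&p, 1, 0) > 0 && (p.revents & POLLIN))
                return true;        /* data arrived while spinning */
        }

        /* Phase 2: give up the CPU and block until data arrives. */
        return poll(&p, 1, -1) > 0 && (p.revents & POLLIN);
    }

In a real MPI progress engine the spin phase would poll shared-memory queues and network completion queues rather than a single descriptor, but the latency/CPU trade-off is the same.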
Re: [OMPI users] (no subject)
Do you really mean that Open-MPI uses a busy loop in order to handle incoming calls? That seems wrong to me, since spinning is a very bad and inefficient technique for this purpose. Why don't you use blocking and/or signals instead? I think the priority of this task is very high, because polling just wastes resources of the system.

On the other hand, what Alberto reports does not seem reasonable to me.

Alberto,
- Are you oversubscribing one node, i.e., running your code on a single-processor machine while pretending to have four CPUs?
- Did you compile Open-MPI or install it from an RPM?

Receiving a message shouldn't be that expensive.

Regards,
Danesh

Jeff Squyres skrev:
Because on-node communication typically uses shared memory, so we currently have to poll. Additionally, when using mixed on/off-node communication, we have to alternate between polling shared memory and polling the network.
Additionally, we actively poll because it's the best way to lower latency. MPI implementations are almost always first judged on their latency, not [usually] their CPU utilization. Going to sleep in a blocking system call will definitely negatively impact latency.
We have plans for implementing the "spin for a while and then block" technique (as has been used in other MPIs and middleware layers), but it hasn't been a high priority.

On Apr 23, 2008, at 12:19 PM, Alberto Giannetti wrote:
Thanks Torje. I wonder what is the benefit of looping on the incoming message-queue socket rather than using system I/O signals, like read() or select().

On Apr 23, 2008, at 12:10 PM, Torje Henriksen wrote:
Hi Alberto,
The blocked processes are in fact spin-waiting. While they don't have anything better to do (waiting for that message), they will check their incoming message-queues in a loop. So the MPI_Recv() operation is blocking, but it doesn't mean that the processes are blocked by the OS scheduler. I hope that made some sense :)
Best regards, Torje

On Apr 23, 2008, at 5:34 PM, Alberto Giannetti wrote:
I have a simple MPI program that sends data to processor rank 0. The communication works well, but when I run the program on more than 2 processors (-np 4) the extra receivers waiting for data run at > 90% CPU load. I understand MPI_Recv() is a blocking operation, but why does it consume so much CPU compared to a regular system read()?

[snip: quoted test program; it appears in full in the original message at the end of this digest]

___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
Re: [OMPI users] (no subject)
OMPI doesn't use SYSV shared memory; it uses mmapped files.

ompi_info will tell you all about the components installed. If you see a BTL component named "sm", then shared memory support is installed. I do not believe that we conditionally install sm on Linux or OS X systems -- it should always be installed.

ompi_info | grep btl

On Apr 23, 2008, at 2:55 PM, Alberto Giannetti wrote:
I am running the test program on Darwin 8.11.1, 1.83 GHz Intel dual core. My Open MPI install is 1.2.4. I can't see any allocated shared memory segment on my system (ipcs -m), although the receiver opens a couple of TCP sockets in listening mode. It looks like my implementation does not use shared memory. Is this a configuration issue?

a.out 5628 albertogiannetti 3u unix R,W,NB 0x380b198 0t0 ->0x41ced48
a.out 5628 albertogiannetti 4u unix R,W 0x41ced48 0t0 ->0x380b198
a.out 5628 albertogiannetti 5u IPv4 R,W,NB 0x3d4d920 0t0 TCP *:50969 (LISTEN)
a.out 5628 albertogiannetti 6u IPv4 R,W,NB 0x3e62394 0t0 TCP 192.168.0.10:50970->192.168.0.10:50962 (ESTABLISHED)
a.out 5628 albertogiannetti 7u IPv4 R,W,NB 0x422d228 0t0 TCP *:50973 (LISTEN)
a.out 5628 albertogiannetti 8u IPv4 R,W,NB 0x2dfd394 0t0 TCP 192.168.0.10:50969->192.168.0.10:50975 (ESTABLISHED)

[snip: remainder of quoted thread, including the test program; it appears in full in the messages below]

___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users

-- Jeff Squyres
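Acting on the suggestion above, a quick sketch of checking for and explicitly requesting the shared-memory BTL (standard ompi_info/mpirun usage; ./a.out stands in for the test program from this thread):

    # Is the sm BTL installed?
    ompi_info | grep btl

    # Request shared memory explicitly for on-node traffic; the job
    # will fail to start rather than silently fall back to TCP if the
    # sm BTL cannot be used:
    mpirun --mca btl sm,self -np 2 ./a.out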
Re: [OMPI users] (no subject)
I am running the test program on Darwin 8.11.1, 1.83 GHz Intel dual core. My Open MPI install is 1.2.4. I can't see any allocated shared memory segment on my system (ipcs -m), although the receiver opens a couple of TCP sockets in listening mode. It looks like my implementation does not use shared memory. Is this a configuration issue?

a.out 5628 albertogiannetti 3u unix R,W,NB 0x380b198 0t0 ->0x41ced48
a.out 5628 albertogiannetti 4u unix R,W 0x41ced48 0t0 ->0x380b198
a.out 5628 albertogiannetti 5u IPv4 R,W,NB 0x3d4d920 0t0 TCP *:50969 (LISTEN)
a.out 5628 albertogiannetti 6u IPv4 R,W,NB 0x3e62394 0t0 TCP 192.168.0.10:50970->192.168.0.10:50962 (ESTABLISHED)
a.out 5628 albertogiannetti 7u IPv4 R,W,NB 0x422d228 0t0 TCP *:50973 (LISTEN)
a.out 5628 albertogiannetti 8u IPv4 R,W,NB 0x2dfd394 0t0 TCP 192.168.0.10:50969->192.168.0.10:50975 (ESTABLISHED)

On Apr 23, 2008, at 12:34 PM, Jeff Squyres wrote:
Because on-node communication typically uses shared memory, so we currently have to poll. Additionally, when using mixed on/off-node communication, we have to alternate between polling shared memory and polling the network.
Additionally, we actively poll because it's the best way to lower latency. MPI implementations are almost always first judged on their latency, not [usually] their CPU utilization. Going to sleep in a blocking system call will definitely negatively impact latency.
We have plans for implementing the "spin for a while and then block" technique (as has been used in other MPIs and middleware layers), but it hasn't been a high priority.

[snip: remainder of quoted thread, including the test program; it appears in full in the original message below]

___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
[OMPI users] (no subject)
I have a simple MPI program that sends data to processor rank 0. The communication works well, but when I run the program on more than 2 processors (-np 4) the extra receivers waiting for data run at > 90% CPU load. I understand MPI_Recv() is a blocking operation, but why does it consume so much CPU compared to a regular system read()?

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <mpi.h>

void process_sender(int);
void process_receiver(int);

int main(int argc, char* argv[])
{
  int rank;

  MPI_Init(&argc, &argv);
  MPI_Comm_rank(MPI_COMM_WORLD, &rank);
  printf("Processor %d (%d) initialized\n", rank, getpid());

  if( rank == 1 )
    process_sender(rank);
  else
    process_receiver(rank);

  MPI_Finalize();
}

void process_sender(int rank)
{
  int i, j, size;
  float data[100];
  MPI_Status status;

  printf("Processor %d initializing data...\n", rank);
  for( i = 0; i < 100; ++i )
    data[i] = i;

  MPI_Comm_size(MPI_COMM_WORLD, &size);
  printf("Processor %d sending data...\n", rank);
  MPI_Send(data, 100, MPI_FLOAT, 0, 55, MPI_COMM_WORLD);
  printf("Processor %d sent data\n", rank);
}

void process_receiver(int rank)
{
  int count;
  float value[200];
  MPI_Status status;

  printf("Processor %d waiting for data...\n", rank);
  MPI_Recv(value, 200, MPI_FLOAT, MPI_ANY_SOURCE, 55, MPI_COMM_WORLD, &status);
  printf("Processor %d Got data from processor %d\n", rank, status.MPI_SOURCE);
  MPI_Get_count(&status, MPI_FLOAT, &count);
  printf("Processor %d, Got %d elements\n", rank, count);
}
[OMPI users] (no subject)
I'm having some difficulty getting the Open MPI checkpoint/restart fault tolerance working. I have compiled Open MPI with the "--with-ft=cr" flag, but when I attempt to run my test program (ring), the ompi-checkpoint command fails. I have verified that the test program works fine without fault tolerance enabled. Here are the details:

[me@dev1 ~]$ mpirun -np 4 -am ft-enable-cr ring
[me@dev1 ~]$ ps -efa | grep mpirun
me 3052 2820 1 08:25 pts/2 00:00:00 mpirun -np 4 -am ft-enable-cr ring
[me@dev1 ~]$ ompi-checkpoint 3052
[dev1.acme.local:03060] [NO-NAME] ORTE_ERROR_LOG: Unknown error: 5854512 in file sds_singleton_module.c at line 50
[dev1.acme.local:03060] [NO-NAME] ORTE_ERROR_LOG: Unknown error: 5854512 in file runtime/orte_init.c at line 311
--
It looks like orte_init failed for some reason; your parallel process is likely to abort. There are many reasons that a parallel process can fail during orte_init; some of which are due to configuration or environment problems. This failure appears to be an internal failure; here's some additional information (which may only be relevant to an Open MPI developer):
orte_sds_base_set_name failed
--> Returned value Unknown error: 5854512 (5854512) instead of ORTE_SUCCESS
--

Any help would be appreciated. Thanks.

[attachments: ompi_info.txt.gz, config.log.gz]
Re: [OMPI users] (no subject)
Hi,

I wrote the information below in my hostfile:

192.168.1.1 4 slots
192.168.1.2 4 slots

and I entered the command below in the directory which contains my hostfile (my_host):

~Administrator/PCal$ mpirun -np 8 -hostfile my_host --byslot hello

Then the following information was returned:

bash: line 1: orted: command not found.
[Apple1.local:00516][0,0,0] ORTE_ERROR_LOG: Timeout in file base/pls_base_orted_cmds.c at line 275
...
[Apple1.local:00516] ERROR: A daemon on node 192.168.1.2 failed to start as expected.
...
[Apple1.local:00516][0,0,0] ORTE_ERROR_LOG: Timeout in file pls_rsh_module.c at line 1187.
...
mpirun was unable to cleanly terminate the daemons for this job. Returned value Timeout instead of ORTE_SUCCESS.

Should there be an SSI (Single System Image) on both of my Apple PCs? What can I do? Thank you.

In your letter you mentioned:
>From: "Götz Waschk" <goetz.was...@gmail.com>
>Reply-To: Open MPI Users <us...@open-mpi.org>
>To: "Open MPI Users" <us...@open-mpi.org>
>Subject: Re: [OMPI users] (no subject)
>Date: Wed, 4 Apr 2007 13:28:15 +0200
>
>On 4/4/07, JiaXing Cai <ca...@mail.ustc.edu.cn> wrote:
>> I want to do a parallel job on 2 Apple PowerPCs (Power Mac G5 Quad) which run on Mac OS X 10.4.8. How can I make them communicate with each other using open-mpi? I have tried, but failed. An error related to daemons has occurred.
>
>Hi,
>could you please tell us what exactly you have tried and please include the complete error message as well.
>
>Regards, Götz Waschk
>
>--
>AL I:40: Do what thou wilt shall be the whole of the Law.
>
>___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users
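A note on the first error line: "orted: command not found" means the shell that ssh starts on the remote node cannot find Open MPI's orted daemon, i.e., PATH (and typically LD_LIBRARY_PATH) are not set for non-interactive logins there. No single-system image is needed; plain ssh connectivity between the machines is enough. A common remedy is mpirun's standard --prefix option (the install prefix below is an assumption):

    # Tell mpirun where Open MPI is installed on the remote nodes, so
    # it sets PATH and LD_LIBRARY_PATH for the orted it launches:
    mpirun --prefix /usr/local -np 8 -hostfile my_host --byslot hello

Alternatively, export PATH and LD_LIBRARY_PATH for non-interactive shells (e.g., in ~/.bashrc) on every node.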
[OMPI users] (no subject)
Help! I want to do a parallel job on 2 Apple PowerPCs (Power Mac G5 Quad) which run Mac OS X 10.4.8. How can I make them communicate with each other using open-mpi? I have tried, but failed. An error related to daemons has occurred.
Re: [OMPI users] (no subject)
Check out "Windows Compute Cluster Server 2003", http://www.microsoft.com/windowsserver2003/ccs/default.mspx. From the FAQ: "Windows Compute Cluster Server 2003 comes with the Microsoft Message Passing Interface (MS MPI), an MPI stack based on the MPICH2 implementation from Argonne National Labs." I have no experience with it, just sharing the link. Jonathan usha devi regadi wrote: hello I'll be glad to know if an MPI is available On WINDOWS Platform. Regards usha ___ users mailing list us...@open-mpi.org http://www.open-mpi.org/mailman/listinfo.cgi/users