Re: [OMPI users] ORTE daemon has unexpectedly failed after launch

2014-08-20 Thread Timur Ismagilov
little investigation today and file >> a bug. I'll make sure you're CC'ed on the bug ticket. >> >> >> >> On Aug 12, 2014, at 12:27 PM, Timur Ismagilov < tismagi...@mail.ru > wrote: >> >>> I don't have this error in OMPI 1.9a1r32252 and OMPI 1.8.1 (wit

Re: [OMPI users] ORTE daemon has unexpectedly failed after launch

2014-08-21 Thread Timur Ismagilov
ell. >> >> >>On Wed, Aug 20, 2014 at 8:06 PM, Ralph Castain < r...@open-mpi.org > wrote: >>>It was not yet fixed - but should be now. >>> >>>On Aug 20, 2014, at 6:39 AM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>>>Hello! >

[OMPI users] long initialization

2014-08-22 Thread Timur Ismagilov
Castain <r...@open-mpi.org>: >Not sure I understand. The problem has been fixed in both the trunk and the >1.8 branch now, so you should be able to work with either of those nightly >builds. > >On Aug 21, 2014, at 12:02 AM, Timur Ismagilov < tismagi...@mail.ru > wrote:

Re: [OMPI users] long initialization

2014-08-26 Thread Timur Ismagilov
is ";" . You can change delimiter with >>mca_base_env_list_delimiter. >> >> >> >>On Fri, Aug 22, 2014 at 2:59 PM, Timur Ismagilov < tismagi...@mail.ru > >>wrote: >>>Hello! >>>If i use latest night snapshot: >>>

Re: [OMPI users] long initialization

2014-08-27 Thread Timur Ismagilov
ve the time it >takes Slurm to launch our daemon on the remote host, so you get about half of >a second. > >IIRC, you were having some problems with the OOB setup. If you specify the TCP >interface to use, does your time come down? > > >On Aug 26, 2014, at 8:32 AM, Timur

Re: [OMPI users] long initialization

2014-08-28 Thread Timur Ismagilov
bose 100" >to your cmd line > >On Aug 27, 2014, at 4:31 AM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>When i try to specify oob with --mca oob_tcp_if_include >from ifconfig>, i alwase get err

Re: [OMPI users] long initialization

2014-08-28 Thread Timur Ismagilov
0m4.166s user 0m0.034s sys 0m0.079s Thu, 28 Aug 2014 13:10:02 +0400 from Timur Ismagilov <tismagi...@mail.ru>: >I enclose 2 files with the output of the two following commands (OMPI 1.9a1r32570) >$time mpirun --leave-session-attached -mca oob_base_verbose 100 -np 1 >./hello_c >& out1.

[OMPI users] open shmem optimization

2014-08-29 Thread Timur Ismagilov
Hello! What parameter can I tune to increase performance (scalability) for my app (an all-to-all pattern with message size = constant/nnodes)? I can read this FAQ for MPI, but is it correct for SHMEM? I have 2 programs doing the same thing (with the same input): each node sends messages (message size =
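
For reference, below is a minimal sketch of the communication pattern described here: each PE puts a buffer of size constant/npes to every other PE. The total volume, the buffer layout, and the barrier placement are illustrative assumptions, not the poster's program.

/* all2all_sketch.c -- minimal sketch of an all-to-all put pattern in
 * OpenSHMEM (API of the Open MPI 1.8 era); TOTAL_BYTES and the barrier
 * placement are illustrative assumptions, not the poster's code. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <shmem.h>

#define TOTAL_BYTES (1 << 20)          /* constant total volume per PE */

int main(void)
{
    start_pes(0);
    int npes = _num_pes();
    int me   = _my_pe();

    size_t msg = TOTAL_BYTES / npes;   /* message size = constant / npes */

    /* symmetric buffers: every PE owns npes slots of msg bytes each */
    char *src = shmalloc(msg);
    char *dst = shmalloc((size_t)npes * msg);
    memset(src, me, msg);

    shmem_barrier_all();
    for (int pe = 0; pe < npes; pe++) {
        int target = (me + pe) % npes;              /* rotate targets to spread load */
        shmem_putmem(dst + (size_t)me * msg, src, msg, target);
    }
    shmem_barrier_all();

    if (me == 0)
        printf("all-to-all done: %d PEs, %zu bytes per message\n", npes, msg);

    shfree(src);
    shfree(dst);
    return 0;
}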

[OMPI users] shmalloc error with >=512 mb

2014-11-17 Thread Timur Ismagilov
Hello! Why does shmalloc return NULL when I try to allocate 512 MB? When I try to allocate 256 MB, all is fine. I use Open MPI/SHMEM v1.8.4 rc1 (v1.8.3-202-gb568b6e). Program: #include #include int main(int argc, char **argv) { int *src; start_pes(0); int length = 1024*1024*512; src = (int*)
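
The code fragment above is cut off by the archive preview; a minimal reconstruction is sketched below (the header names and the NULL check are assumptions, and whether the original multiplied length by sizeof(int) is not visible, so the sketch allocates length bytes). A likely cause of the NULL return is that 512 MB does not fit in the default OSHMEM symmetric heap; enlarging the heap, for example via the SHMEM_SYMMETRIC_HEAP_SIZE environment variable or the corresponding memheap MCA parameter, is usually the remedy, but the exact knob should be verified against the installed Open MPI version.

/* shmalloc_test.c -- reconstruction of the truncated test above; the header
 * names and the NULL check are assumptions, the sizes are from the post. */
#include <stdio.h>
#include <shmem.h>

int main(int argc, char **argv)
{
    int *src;
    start_pes(0);

    int length = 1024 * 1024 * 512;              /* 512 MB */
    src = (int *)shmalloc(length);               /* allocates length bytes */
    if (src == NULL) {
        printf("PE %d: shmalloc(%d) returned NULL\n", _my_pe(), length);
    } else {
        printf("PE %d: shmalloc(%d) succeeded\n", _my_pe(), length);
        shfree(src);
    }
    return 0;
}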

[OMPI users] MXM problem

2015-05-25 Thread Timur Ismagilov
Hello! I use ompi-v1.8.4 from hpcx-v1.3.0-327-icc-OFED-1.5.3-redhat6.2; OFED-1.5.4.1; CentOS release 6.2; InfiniBand 4x FDR. I have two problems: 1. I cannot use mxm: 1.a) $mpirun --mca pml cm --mca mtl mxm -host node5,node14,node28,node29 -mca plm_rsh_no_tree_spawn 1 -np 4 ./hello

Re: [OMPI users] MXM problem

2015-05-25 Thread Timur Ismagilov
2015, 9:04 -07:00 from Ralph Castain <r...@open-mpi.org>: >I can’t speak to the mxm problem, but the no-tree-spawn issue indicates that >you don’t have password-less ssh authorized between the compute nodes > > >>On May 25, 2015, at 8:55 AM, Timur Ismagilov < tismagi.

Re: [OMPI users] MXM problem

2015-05-25 Thread Timur Ismagilov
_PATH,PATH, LD_PRELOAD and OPAL_PREFIX that it is pointing to the >right mpirun? > >Also, could you please check that yalla is present in the ompi_info -l 9 >output? > >Thanks > >On Mon, May 25, 2015 at 7:11 PM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>I can

[OMPI users] Fwd: Re[4]: MXM problem

2015-05-25 Thread Timur Ismagilov
lease select export MXM_IB_PORTS=mlx4_0:1 explicitly and retry > >On Mon, May 25, 2015 at 8:26 PM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>Hi, Mike, >>that is what i have: >>$ echo $LD_LIBRARY_PATH | tr ":" "\n" >>/gpfs/NETHOME/oivt1

Re: [OMPI users] MXM problem

2015-05-26 Thread Timur Ismagilov
:53 +03:00 from Mike Dubman <mi...@dev.mellanox.co.il>: >Hi Timur, > >Here it goes: > >wget >ftp://bgate.mellanox.com/hpc/hpcx/custom/v1.3/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64.tbz > >Please let me know if it works for you and will add 1.5.4.1 mofed to the >

Re: [OMPI users] MXM problem

2015-05-26 Thread Timur Ismagilov
t; -mca plm_base_verbose 5  -mca oob_base_verbose 10 -mca rml_base_verbose 10 >--debug-daemons > >On Tue, May 26, 2015 at 11:38 AM, Timur Ismagilov < tismagi...@mail.ru > >wrote: >>1. mxm_perf_test - OK. >>2. no_tree_spawn  - OK. >>3. ompi yalla and "--mca pml c

Re: [OMPI users] MXM problem

2015-05-28 Thread Timur Ismagilov
t;>Alina - could you please take a look? >>Thx >> >> >>-- Forwarded message -- >>From: Timur Ismagilov < tismagi...@mail.ru > >>Date: Tue, May 26, 2015 at 12:40 PM >>Subject: Re[12]: [OMPI users] MXM problem >>To: Open MPI Users < u

Re: [OMPI users] MXM problem

2015-05-28 Thread Timur Ismagilov
> >On Thu, May 28, 2015 at 10:21 AM, Timur Ismagilov < tismagi...@mail.ru > >wrote: >>I'm sorry for the delay . >> >>Here it is: >>( I used 5 min time limit ) >>/gpfs/NETHOME/oivt1/nicevt/itf/sources/hpcx-v1.3.330-icc-OFED-1.5.4.1-redhat6.2-x86_64/

Re: [OMPI users] Fwd[2]: OMPI yalla vs impi

2015-06-02 Thread Timur Ismagilov
lgorithm. > >To see benefit of yalla - you should run p2p benchmarks (osu_lat/bw/bibw/mr) > > >On Thu, May 28, 2015 at 7:35 PM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>I compare ompi-1.8.5 (hpcx-1.3.3-icc) with impi v 4.1.4. >> >>I build ompi with MX

Re: [OMPI users] OMPI yalla vs impi

2015-06-03 Thread Timur Ismagilov
a and without hcoll? > >Thanks, >Alina. > > > >On Tue, Jun 2, 2015 at 4:56 PM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>Hi, Mike! >>I have impi v 4.1.2 (- impi) >>I build ompi 1.8.5 with MXM and hcoll (- ompi_yalla) >>I build ompi 1.8.

Re: [OMPI users] Fwd[2]: OMPI yalla vs impi

2015-06-04 Thread Timur Ismagilov
ults? > >2. add the following to the command line that you are running with 'pml yalla' >and attach the results? >"-x MXM_TLS=self,shm,rc" > >3. run your command line with yalla and without hcoll? > >Thanks, >Alina. > > > >On Tue, Jun 2, 2015 at 4:56

Re: [OMPI users] Fwd[2]: OMPI yalla vs impi

2015-06-16 Thread Timur Ismagilov
t; >1. --map-by node --bind-to socket >2. --map-by node --bind-to core > >Please attach your results. > >Thank you, >Alina. > >On Thu, Jun 4, 2015 at 6:53 PM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>Hello, Alina. >>1. Here is my  >>o

Re: [OMPI users] Fwd[2]: OMPI yalla vs impi

2015-06-16 Thread Timur Ismagilov
ase try running your  ompi_yalla cmd with ' --bind-to socket' >(instead of binding to core) and check if it affects the results? >We saw that it made a difference on the performance in our lab so that's why I >asked you to try the same. > >Thanks, >Alina. > >On Tue, J

Re: [OMPI users] Fwd[2]: OMPI yalla vs impi

2015-06-19 Thread Timur Ismagilov
ina Sklarevich <ali...@dev.mellanox.co.il>: >Hi Timur, > >Can you please tell me which osu version you are using? >Unless it is from HPCX, please attach the source file of osu_mbw_mr.c you are >using. > >Thank you, >Alina. > >On Tue, Jun 16, 2015 at 7:10 PM, Timur I

[OMPI users] spml_ikrit_np random values

2014-06-05 Thread Timur Ismagilov
Hello! I am using Open MPI v1.8.1. $oshmem_info -a --parsable | grep spml_ikrit_np mca:spml:ikrit:param:spml_ikrit_np:value:1620524368 (always a new value) mca:spml:ikrit:param:spml_ikrit_np:source:default mca:spml:ikrit:param:spml_ikrit_np:status:writeable

Re: [OMPI users] [warn] Epoll ADD(1) on fd 0 failed

2014-06-06 Thread Timur Ismagilov
is is >some interaction with sbatch, but I'll take a look. I haven't seen that >warning. Mike indicated he thought it is due to both slurm and OMPI trying to >control stdin/stdout, in which case it shouldn't be happening but you can >safely ignore it > > >On Jun 5, 2014, at 3:

[OMPI users] Problem with yoda component in oshmem.

2014-06-06 Thread Timur Ismagilov
Hello! I am using Open MPI v1.8.1 with the example program hello_oshmem.cpp. When I set spml_ikrit_np = 1000 (more than 4) and run the task on 4 (2, 1) nodes, I get the following in the output file: No available spml components were found! This means that there are no components of this type installed on your system or

[OMPI users] openMP and mpi problem

2014-07-02 Thread Timur Ismagilov
Hello! I have Open MPI 1.9a1r32104 and Open MPI 1.5.5. I get much better performance with Open MPI 1.5.5 and OpenMP on 8 cores in the program: #define N 1000 int main(int argc, char *argv[]) { ... MPI_Init(&argc, &argv); ... for (i = 0; i < N; i++) { a[i] = i * 1.0; b[i] =
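
The program is cut off in the preview; below is a minimal self-contained sketch of the kind of MPI + OpenMP loop being compared (the value of N, the OpenMP pragma, and the timing are illustrative assumptions, not the original code). The performance gap raised in the follow-ups is typically a process-binding effect: the 1.8/1.9 series binds ranks to a core or socket by default, which can confine all OpenMP threads of a rank to one core, whereas the 1.5.x series did not bind by default; running with something like --bind-to none usually makes the comparison fair, though this should be confirmed with the verbose mapping output suggested in the replies.

/* hybrid_sketch.c -- minimal MPI + OpenMP loop of the kind discussed above;
 * N, the pragma, and the timing are illustrative assumptions. */
#include <stdio.h>
#include <stdlib.h>
#include <mpi.h>
#include <omp.h>

#define N 10000000   /* illustrative; the original "#define N 1000..." is truncated */

int main(int argc, char *argv[])
{
    MPI_Init(&argc, &argv);
    int rank;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *a = malloc(N * sizeof(double));
    double *b = malloc(N * sizeof(double));
    double *c = malloc(N * sizeof(double));

    double t0 = MPI_Wtime();
    #pragma omp parallel for
    for (long i = 0; i < N; i++) {       /* threaded loop inside each rank */
        a[i] = i * 1.0;
        b[i] = i * 2.0;
        c[i] = a[i] + b[i];
    }
    double t1 = MPI_Wtime();

    if (rank == 0)
        printf("loop time with %d OpenMP threads: %f s\n",
               omp_get_max_threads(), t1 - t0);

    free(a); free(b); free(c);
    MPI_Finalize();
    return 0;
}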

Re: [OMPI users] openMP and mpi problem

2014-07-03 Thread Timur Ismagilov
s an interactive job and use top itself, is there? > >As for that sbgp warning - you can probably just ignore it. Not sure why that >is failing, but it just means that component will disqualify itself. If you >want to eliminate it, just add > >-mca sbgp ^ibnet > >to your cmd line > > >

Re: [OMPI users] openMP and mpi problem

2014-07-04 Thread Timur Ismagilov
n of OMPI. Please check >your LD_LIBRARY_PATH and ensure that the 1.9 installation is at the *front* of >that list. > >Of course, I'm also assuming that you installed the two versions into >different locations - yes? > >Also, add "--mca rmaps_base_verbose 20" to your cmd line -

Re: [OMPI users] openMP and mpi problem

2014-07-04 Thread Timur Ismagilov
s? I >suspect I know the problem, but need to see the actual cmd line to confirm it > >Thanks >Ralph > >On Jul 4, 2014, at 1:38 AM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>There is only one path to mpi lib. >>echo $LD_LIBRARY_PATH   >>/opt/int

[OMPI users] Salloc and mpirun problem

2014-07-16 Thread Timur Ismagilov
Hello! I have Open MPI v1.9a1r32142 and Slurm 2.5.6. I cannot use mpirun after salloc: $salloc -N2 --exclusive -p test -J ompi $LD_PRELOAD=/mnt/data/users/dm2/vol3/semenov/_scratch/mxm/mxm-3.0/lib/libmxm.so mpirun -np 1 hello_c

Re: [OMPI users] Salloc and mpirun problem

2014-07-16 Thread Timur Ismagilov
" and attach output. >Thx > > >On Wed, Jul 16, 2014 at 11:12 AM, Timur Ismagilov < tismagi...@mail.ru > >wrote: >>Hello! >>I have Open MPI v1.9a1r32142 and slurm 2.5.6. >> >>I can not use mpirun after salloc: >> >>$salloc -N2 --exclusive

Re: [OMPI users] Salloc and mpirun problem

2014-07-17 Thread Timur Ismagilov
Thu, 17 Jul 2014 11:40:24 +0300 from Mike Dubman <mi...@dev.mellanox.co.il>: >can you use latest ompi-1.8 from svn/git? >Ralph - could you please suggest. >Thx > > >On Wed, Jul 16, 2014 at 2:48 PM, Timur Ismagilov < tismagi...@mail.ru > wrote: >>Here it is: >>

[OMPI users] Fwd: Re[4]: Salloc and mpirun problem

2014-07-20 Thread Timur Ismagilov
I have the same problem in Open MPI 1.8.1 (Apr 23, 2014). Does the srun command have a --map-by parameter like mpirun's, or can I change it from the bash environment? Forwarded message From: Timur Ismagilov <tismagi...@mail.ru> To: Mike Dubman <mi...@dev.mellanox.co.

Re: [OMPI users] Fwd: Re[4]: Salloc and mpirun problem

2014-07-20 Thread Timur Ismagilov
Forwarded message From: Timur Ismagilov <tismagi...@mail.ru> To: Ralph Castain <r...@open-mpi.org> Date: Sun, 20 Jul 2014 21:58:41 +0400 Subject: Re[2]: [OMPI users] Fwd: Re[4]: Salloc and mpirun problem Here it is: $ salloc -N2 --exclusive -p test -J

Re: [OMPI users] Fwd: Re[4]: Salloc and mpirun problem

2014-07-21 Thread Timur Ismagilov
ubnet - this >is generally a bad idea. I also saw that the last one in the list shows up >twice in the kernel array - not sure why, but is there something special about >that NIC? > >What do the NICs look like on the remote hosts? > >On Jul 20, 2014, at 10:59 AM, Timur Ismagilov

Re: [OMPI users] Salloc and mpirun problem

2014-07-23 Thread Timur Ismagilov
:17849] mca: base: close: component oob closed >>[node1-128-18:17849] mca: base: close: unloading component oob >>[node1-128-18:17849] [[65177,0],2] TCP SHUTDOWN >>[node1-128-18:17849] [[65177,0],2] RELEASING PEER OBJ [[65177,0],0] >>[node1-128-18:17849] [[65177,0],2] CLOSING

[OMPI users] ORTE daemon has unexpectedly failed after launch

2014-08-12 Thread Timur Ismagilov
Hello! I have Open MPI v1.8.2rc4r32485. When I run hello_c, I get this error message: $mpirun -np 2 hello_c An ORTE daemon has unexpectedly failed after launch and before communicating back to mpirun. This could be caused by a number of factors, including an inability to create a connection

[OMPI users] ORTE daemon has unexpectedly failed after launch

2014-08-12 Thread Timur Ismagilov
base: close: component oob closed [compiler-2:08792] mca: base: close: unloading component oob [compiler-2:08792] [[42190,0],0] TCP SHUTDOWN [compiler-2:08792] mca: base: close: component tcp closed [compiler-2:08792] mca: base: close: unloading component tcp Tue, 12 Aug 2014 16:14:58 +0400 от Timu

[OMPI users] mpi+openshmem hybrid

2014-08-14 Thread Timur Ismagilov
Hello! I use Open MPI v1.9a1r32520. Can I use hybrid MPI+OpenSHMEM? Where can I read about it? I have some problems with a simple program: #include #include "shmem.h" #include "mpi.h" int main(int argc, char* argv[]) { int proc, nproc; int rank, size, len; char
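
The program is cut off in the preview; a minimal hybrid "hello" of the kind being asked about is sketched below. The declared variables follow the fragment above, but the initialization order (start_pes() first, MPI calls afterwards, relying on Open MPI's OSHMEM bringing up the MPI layer underneath) is an assumption and should be checked against the documentation of the installed version.

/* hybrid_hello.c -- sketch of a combined MPI + OpenSHMEM "hello"; the
 * initialization order is an assumption about Open MPI's OSHMEM. */
#include <stdio.h>
#include "shmem.h"
#include "mpi.h"

int main(int argc, char *argv[])
{
    int proc, nproc;                       /* OpenSHMEM PE id and count   */
    int rank, size, len;                   /* MPI rank, size, name length */
    char name[MPI_MAX_PROCESSOR_NAME];

    start_pes(0);                          /* assumed to also set up MPI  */
    proc  = _my_pe();
    nproc = _num_pes();

    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);
    MPI_Get_processor_name(name, &len);

    printf("PE %d of %d / MPI rank %d of %d on %s\n",
           proc, nproc, rank, size, name);

    shmem_barrier_all();
    return 0;
}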