Re: [OMPI users] ompi-restart failed && ompi-migrate

2012-04-11 Thread kidd
Hello! I checked my OS (Ubuntu 11); prelink is not installed. Are there other possible reasons for the ompi-restart failure? Thanks. From: Josh Hursey To: Open MPI Users Sent: 2012/4/11 (Wed) 8:36 PM Subject: Re: [OMPI users] ompi-restart

[OMPI users] MPI_Send, MPI_Recv problem on Mac and Linux

2012-04-11 Thread Peter Sels
Dear Open MPI users, I think this should be an easy question for anyone with more experience than an Open MPI hello-world program... I wrote some Open MPI code in which the master sends a length and then a buffer of that length as two consecutive MPI messages. The slave is receiving these messages and

Re: [OMPI users] ompi-restart failed && ompi-migrate

2012-04-11 Thread Josh Hursey
The 1.5 series does not support process migration, so there is no ompi-migrate option there. This was only contributed to the trunk (1.7 series). However, changes to the runtime environment over the past few months have broken this functionality. It is currently unclear when this will be repaired.

Re: [OMPI users] sge tight integration leads to bad allocation

2012-04-11 Thread Ralph Castain
On Apr 11, 2012, at 6:20 AM, Reuti wrote: > On 11.04.2012 at 04:26, Ralph Castain wrote: > >> Hi Reuti >> >> Can you replicate this problem on your machine? Can you try it with 1.5? > > No. It's also working fine in 1.5.5 in some tests. I even forced an uneven > distribution by limiting the

Re: [OMPI users] sge tight integration leads to bad allocation

2012-04-11 Thread Reuti
On 11.04.2012 at 04:26, Ralph Castain wrote: > Hi Reuti > > Can you replicate this problem on your machine? Can you try it with 1.5? No. It's also working fine in 1.5.5 in some tests. I even forced an uneven distribution by limiting the slots setting for some machines in the queue

Re: [OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread Ralph Castain
Ouch - finally figured out what happened. Jeff and I did indeed address this problem a few weeks ago. There were some changes required in a couple of places to make it all work, so we did the work in a Mercurial branch Jeff set up. Unfortunately, I think he got distracted by the MPI Forum

Re: [OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread Ralph Castain
Interesting. Jeff and I had discussed that very problem not that long ago, and I could swear he fixed it - but I don't see the CMR for that code. He's on vacation this week, so I'll wait for his return to look at it. Thanks! Ralph On Apr 11, 2012, at 2:36 AM, Brice Goglin wrote: > A quick

Re: [OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread Brice Goglin
Here's a better patch. Still only compile tested :) Brice On 11/04/2012 10:36, Brice Goglin wrote: > A quick look at the code seems to confirm my feeling. get/set_module() > callbacks manipulate arrays of logical indexes, and they do not convert > them back to physical indexes before binding.

Re: [OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread Brice Goglin
A quick look at the code seems to confirm my feeling. get/set_module() callbacks manipulate arrays of logical indexes, and they do not convert them back to physical indexes before binding. Here's a quick patch that may help. Only compile tested... Brice On 11/04/2012 09:49, Brice Goglin

Re: [OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread Brice Goglin
On 11/04/2012 09:06, tmish...@jcity.maeda.co.jp wrote: > Hi, Brice. > > I installed the latest hwloc-1.4.1. > Here is the output of lstopo -p. > > [root@node03 bin]# ./lstopo -p > Machine (126GB) > Socket P#0 (32GB) > NUMANode P#0 (16GB) + L3 (5118KB) > L2 (512KB) + L1 (64KB) + Core

Re: [OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread tmishima
Hi, Brice. I installed the latest hwloc-1.4.1. Here is the output of lstopo -p. [root@node03 bin]# ./lstopo -p Machine (126GB) Socket P#0 (32GB) NUMANode P#0 (16GB) + L3 (5118KB) L2 (512KB) + L1 (64KB) + Core P#0 + PU P#0 L2 (512KB) + L1 (64KB) + Core P#1 + PU P#4 L2

Re: [OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread Brice Goglin
Can you send the output of lstopo -p? (You'll have to install hwloc.) Brice tmish...@jcity.maeda.co.jp wrote: Hi, I updated Open MPI from version 1.5.4 to 1.5.5. Since then, my application runs considerably slower than before because of wrong core bindings. As far as I checked, it

[OMPI users] wrong core binding by openmpi-1.5.5

2012-04-11 Thread tmishima
Hi, I updated Open MPI from version 1.5.4 to 1.5.5. Since then, my application runs considerably slower than before because of wrong core bindings. As far as I checked, it seems that openmpi-1.5.4 gives correct core bindings for my Magny-Cours-based machine. 1) My script is as follows:

[OMPI users] ompi-restart failed && ompi-migrate

2012-04-11 Thread kidd
Hello! I have run into some problems. This is my environment: BLCR = 0.8.4, Open MPI = 1.5.5, OS = Ubuntu 11.04. I have 2 nodes: cuda05 (master, which hosts the NFS file system) and cuda07 (slave, which mounts the master's file system). I have also set ~/.openmpi/mca-params.conf->