Re: [hwloc-users] Compiling hwloc into a static library on Windows and Linux

2012-01-11 Thread Andrew Helwer
> To be clear: I think you're misunderstanding what --enable-embedded-
> mode is for.  Per Samuel's comment, I think you want --enable-static
> (and possibly --disable-shared).

Ah yes, I was misunderstanding the purpose of --enable-embedded-mode. I 
understand now, and am using the --enable-static and --disable-shared flags 
instead. I have been able to compile hwloc into only a static library with 
headers on Linux, but Windows is still giving me some trouble.
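For reference, on Linux the build boiled down to something like this (the 
install prefix is illustrative):

```shell
# Build hwloc as a static-only library; the prefix path is an example.
./configure --enable-static --disable-shared --prefix=$HOME/hwloc-static
make
make install
# Result: lib/libhwloc.a plus the headers under include/
```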

I've installed MinGW and Cygwin, and specified HWLOC_MS_LIB as the path to the 
VS lib tool when running configure. Make works fine (apart from the include 
directory not being set properly, which is easy to work around) until the 
Windows library linking stage:

C:\hwloc-1.3.1>make
Making all in src
make[1]: Entering directory `/cygdrive/c/hwloc-1.3.1/src'
  CC topology.lo
  CC traversal.lo
  CC distances.lo
  CC topology-synthetic.lo
  CC topology-xml.lo
  CC bind.lo
  CC cpuset.lo
  CC misc.lo
  CC topology-windows.lo
topology-windows.c: In function 'hwloc_win_get_VirtualAllocExNumaProc':
topology-windows.c:323:30: warning: assignment from incompatible pointer type [enabled by default]
topology-windows.c:328:28: warning: assignment from incompatible pointer type [enabled by default]
topology-windows.c: In function 'hwloc_look_windows':
topology-windows.c:469:36: warning: assignment from incompatible pointer type [enabled by default]
topology-windows.c:470:38: warning: assignment from incompatible pointer type [enabled by default]
  CCLD   libhwloc_embedded.la
copying selected object files to avoid basename conflicts...
  CCLD   libhwloc.la
libtool: link: warning: `-version-info/-version-number' is ignored for convenience libraries
copying selected object files to avoid basename conflicts...
gcc -I/cygdrive/c/hwloc-1.3.1/include -I/cygdrive/c/hwloc-1.3.1/include -I/cygdrive/c/hwloc-1.3.1/include dolib.c -o dolib
./dolib "/cygdrive/c/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin/lib" X86 .libs/libhwloc.def libhwloc- .libs/libhwloc.lib
The system cannot find the path specified.
"/cygdrive/c/Program Files (x86)/Microsoft Visual Studio 10.0/VC/bin/lib" /machine:X86 /def:.libs/libhwloc.def /name:libhwloc- /out:.libs/libhwloc.lib failed
Makefile:758: recipe for target `.libs/libhwloc.lib' failed
make[1]: *** [.libs/libhwloc.lib] Error 1
make[1]: Leaving directory `/cygdrive/c/hwloc-1.3.1/src'
Makefile:450: recipe for target `all-recursive' failed
make: *** [all-recursive] Error 1

If I run the command manually, it can't find the libhwloc.def file, which is 
reasonable, as that file does not appear to exist in the .libs directory. Am I 
missing something?

Thanks,

Andrew Helwer
Software Developer  - Intern
Acceleware Ltd. (TSX-V:AXE)
www.acceleware.com

Phone: +1.403.249.9099  ext. 348
Fax: +1.403.249.9881
Email: andrew.hel...@acceleware.com








Re: [OMPI users] ompi + bash + GE + modules

2012-01-11 Thread Reuti
Hi,

On 11.01.2012 at 19:12, Mark Suhovecky wrote:

> Edmund-
> 
> Yeah, I've tried that.  No difference.  Our template .bash_profile sources
> the user's .bashrc, so non-interactive  bash shells in our setup are sourcing 
> .bashrc.

SGE 6.2u5 can't handle multi-line environment variables or functions; this was 
fixed in 6.2u6, which isn't free. Do you use -V while submitting the job? You 
can either ignore the error or look into Son of Grid Engine, which fixed it too.

If you can avoid -V, the function could instead be defined in .profile or the 
like if you use -l as suggested. You could even define a starter_method in SGE 
to set it up for all users by default and avoid -V:

#!/bin/sh
module() { ...command...here... }
export -f module
exec "${@}"
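To hook such a starter script in (a sketch only; the script path and queue 
name here are examples), save it executable on the exec hosts and set it as 
the queue's starter_method:

```shell
# Hypothetical path and queue name; adjust to your site.
qconf -mattr queue starter_method /usr/local/sge/starter.sh all.q
```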

-- Reuti


> The modules environment is defined, and works - only jobs that run across
> multiple machines see this error.
> 
> Mark Suhovecky
> HPC System Administrator
> Center for Research Computing
> University of Notre Dame
> suhove...@nd.edu
> 
> From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of 
> Edmund Sumbar [esum...@ualberta.ca]
> Sent: Wednesday, January 11, 2012 12:52 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] ompi + bash + GE + modules
> 
> Hi Mark,
> 
> Have you tried adding -l to the #! line?
> 
> #!/bin/bash -l
> 
> On Wed, Jan 11, 2012 at 10:42 AM, Mark Suhovecky 
> > wrote:
> #!/bin/bash
> #$ 
> 
> module load ompi
> 
> mpiexec
> 
> when the mpiexec is run, we'll see the following errors
> 
> 
> bash: module: line 1: syntax error: unexpected end of file
> bash: error importing function definition for `module'
> 
> 
> 
> --
> Edmund Sumbar
> Research Computing Support
> University of Alberta
> +1 780 492 9360
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users




Re: [OMPI users] ompi + bash + GE + modules

2012-01-11 Thread Gustavo Correa
Hi Mark

I wonder if you need to initialize the module command environment inside your 
SGE 
bash submission script:

$MODULESHOME/init/<shell>

where <shell> is bash in this case.  See 'man module' for more details.

This would be before you actually invoke the module command:

module load openmpi
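Putting the pieces together, the top of the submission script might look like 
this (just a sketch; the option line and program name are illustrative):

```shell
#!/bin/bash
#$ -cwd
# Initialize the module command for bash before first use;
# MODULESHOME is set by the Modules package installation.
. $MODULESHOME/init/bash
module load openmpi
mpiexec ./my_mpi_program
```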

I am guessing your users' default shell is csh, and they perhaps have a 'csh'
module environment initialized by their .cshrc, but the job submission script 
is in bash.

Anyway, we use Torque, not SGE, so this is just a guess.

I hope it helps,
Gus Correa

On Jan 11, 2012, at 12:42 PM, Mark Suhovecky wrote:

> 
> Hi-
> 
> We run OpenMPI 1.4.3 on RHEL5 in a cluster environment.
> We use Univa Grid Engine 8.0.1 (an SGE spinoff) for job submission.
> We've just recently begun supporting the bash shell for submitted jobs,
> and are seeing a problem with submitted MPI jobs.
> 
> Our software environment is managed with the Modules package (version 3.2.8),
> so a typical job submission looks something like this
> 
> #!/bin/bash
> #$ 
> 
> module load ompi
> 
> mpiexec 
> 
> when the mpiexec is run, we'll see the following errors
> 
> 
> bash: module: line 1: syntax error: unexpected end of file
> bash: error importing function definition for `module'
> 
> The module init file contains this function, which is what I'm assuming all 
> the fuss is about:
> 
> module() { eval `/opt/crc/Modules/$MODULE_VERSION/bin/modulecmd bash $*`; }
> export -f module
> 
> There will be multiple instances of the error generated- for example, if  I'm
> running a 48 core mpi-12 job spread across 4 machines,
> I'll see these errors printed 3 times. I don't see these errors
> on single-machine submitted jobs.
> 
> I've found posts for this error on bash, modules, and SGE lists, and have
> tried a number of suggested workarounds that all involve changing how I
> source modules (in /etc/profile.d, .bash_profile, via BASH_ENV), but
> none have gotten rid of this error.
> 
> Since we only see this problem with MPI, I figured it couldn't hurt to post
> here and see if any of you have had this symptom, and what your solution was.
> 
> I should mention that running a submitted MPI job under csh works just fine.
> 
> Thanks for any help,
> 
> Mark
> 
> Mark Suhovecky
> HPC System Administrator
> Center for Research Computing
> University of Notre Dame
> suhovecky at nd.edu




Re: [OMPI users] ompi + bash + GE + modules

2012-01-11 Thread Mark Suhovecky
Edmund-

Yeah, I've tried that.  No difference.  Our template .bash_profile sources
the user's .bashrc, so non-interactive  bash shells in our setup are sourcing 
.bashrc.

The modules environment is defined, and works - only jobs that run across 
multiple machines see this error.

Mark Suhovecky
HPC System Administrator
Center for Research Computing
University of Notre Dame
suhove...@nd.edu

From: users-boun...@open-mpi.org [users-boun...@open-mpi.org] On Behalf Of 
Edmund Sumbar [esum...@ualberta.ca]
Sent: Wednesday, January 11, 2012 12:52 PM
To: Open MPI Users
Subject: Re: [OMPI users] ompi + bash + GE + modules

Hi Mark,

Have you tried adding -l to the #! line?

#!/bin/bash -l

On Wed, Jan 11, 2012 at 10:42 AM, Mark Suhovecky 
> wrote:
#!/bin/bash
#$ 

module load ompi

mpiexec

when the mpiexec is run, we'll see the following errors


bash: module: line 1: syntax error: unexpected end of file
bash: error importing function definition for `module'



--
Edmund Sumbar
Research Computing Support
University of Alberta
+1 780 492 9360




Re: [OMPI users] ompi + bash + GE + modules

2012-01-11 Thread Edmund Sumbar
Hi Mark,

Have you tried adding -l to the #! line?

#!/bin/bash -l

On Wed, Jan 11, 2012 at 10:42 AM, Mark Suhovecky  wrote:

> #!/bin/bash
> #$ 
>
> module load ompi
>
> mpiexec
>
> when the mpiexec is run, we'll see the following errors
>
>
> bash: module: line 1: syntax error: unexpected end of file
> bash: error importing function definition for `module'
>



-- 
Edmund Sumbar
Research Computing Support
University of Alberta
+1 780 492 9360


[OMPI users] ompi + bash + GE + modules

2012-01-11 Thread Mark Suhovecky

Hi-

We run OpenMPI 1.4.3 on RHEL5 in a cluster environment.
We use Univa Grid Engine 8.0.1 (an SGE spinoff) for job submission.
We've just recently begun supporting the bash shell for submitted jobs,
and are seeing a problem with submitted MPI jobs.

Our software environment is managed with the Modules package (version 3.2.8),
so a typical job submission looks something like this

#!/bin/bash
#$ 

module load ompi

mpiexec 

when the mpiexec is run, we'll see the following errors


bash: module: line 1: syntax error: unexpected end of file
bash: error importing function definition for `module'

The module init file contains this function, which is what I'm assuming all the 
fuss is about:

module() { eval `/opt/crc/Modules/$MODULE_VERSION/bin/modulecmd bash $*`; }
export -f module

There will be multiple instances of the error generated- for example, if  I'm
running a 48 core mpi-12 job spread across 4 machines,
I'll see these errors printed 3 times. I don't see these errors
on single-machine submitted jobs.

I've found posts for this error on bash, modules, and SGE lists, and have
tried a number of suggested workarounds that all involve changing how I
source modules (in /etc/profile.d, .bash_profile, via BASH_ENV), but
none have gotten rid of this error.

Since we only see this problem with MPI, I figured it couldn't hurt to post
here and see if any of you have had this symptom, and what your solution was.

I should mention that running a submitted MPI job under csh works just fine.

Thanks for any help,

Mark

Mark Suhovecky
HPC System Administrator
Center for Research Computing
University of Notre Dame
suhovecky at nd.edu


Re: [OMPI users] Status of SLURM integration

2012-01-11 Thread Andrew Senin
Ralph, Jeff, thanks!

I managed to make it work with the following configure options:

 ./configure --with-pmi=/usr/ --with-slurm=/usr/ --without-psm
--prefix=`pwd`/install

Regards,
Andrew Senin

On Wed, Jan 11, 2012 at 7:17 PM, Ralph Castain  wrote:
> Well, yes - but it isn't quite that simple. :-/
>
> If you want to direct-launch on slurm without using the resv_ports option, 
> you need to build OMPI to include PMI support by including --with-pmi on your 
> configure cmd line. You may need to point to where pmi.h resides (e.g., 
> --with-pmi=/opt/slurm/include).
>
> We don't do that automatically because slurm's pmi.h is GPL, and so the 
> resulting binary is GPL. This isn't an issue if you are just using the binary 
> and not distributing it, but we chose to not surprise anyone.
>
> If you build the PMI support, then you can just srun your app without using 
> resv_ports.
>
> HTH
> Ralph
>
> On Jan 11, 2012, at 6:04 AM, Jeff Squyres wrote:
>
>> The latest -- 1.5.5rc2 (just released last night) -- has direct "srun 
>> my_mpi_application" integration.  It's not in a final release yet, but as 
>> you can probably guess by the version number, it'll be in the final version 
>> of 1.5.5.
>>
>> We have 1-2 bugs remaining in 1.5.5 that are actively being worked.  Once 
>> those are fixed (hopefully, in the Very Near Future), 1.5.5 will be released.
>>
>>
>> On Jan 10, 2012, at 11:38 PM, Andrew Senin wrote:
>>
>>> Hi,
>>>
>>> Could you please describe the current status of SLURM integration? I
>>> had a feeling srun supports direct launch of Open MPI applications
>>> (without mpirun) compiled with the 1.5 branch.  At least one of my
>>> colleagues succeeded at that.
>>>
>>> But when I installed SLURM and the head revision of OpenMPI 1.5 branch
>>> I did not manage to run it without setting the SLURM_STEP_RESV_PORTS
>>> environment variable. I receive the following:
>>>
>>> orte_grpcomm_modex failed
>>> --> Returned "A message is attempting to be sent to a process whose
>>> contact information is unknown" (-117) instead of "Success" (0)
>>> --
>>> [mir9:25477] *** An error occurred in MPI_Init
>>> [mir9:25477] *** on a NULL communicator
>>> [mir9:25477] *** Unknown error
>>> [mir9:25477] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>>>
>>> So I have 2 questions:
>>> 1. Is support of SLURM in the head revision of 1.5 branch stable
>>> enough to use it in the lab?
>>> 2. Does direct launch of mpi applications require setting the
>>> SLURM_STEP_RESV_PORTS environment variable?
>>>
>>> Thanks,
>>> Andrew Senin.
>>>
>>
>>
>> --
>> Jeff Squyres
>> jsquy...@cisco.com
>> For corporate legal information go to:
>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>
>>
>
>



Re: [OMPI users] Status of SLURM integration

2012-01-11 Thread Ralph Castain
Well, yes - but it isn't quite that simple. :-/

If you want to direct-launch on slurm without using the resv_ports option, you 
need to build OMPI to include PMI support by including --with-pmi on your 
configure cmd line. You may need to point to where pmi.h resides (e.g., 
--with-pmi=/opt/slurm/include).

We don't do that automatically because slurm's pmi.h is GPL, and so the 
resulting binary is GPL. This isn't an issue if you are just using the binary 
and not distributing it, but we chose to not surprise anyone.

If you build the PMI support, then you can just srun your app without using 
resv_ports.
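As a sketch of the two steps (paths are examples; adjust to where slurm is 
installed on your system):

```shell
# Configure Open MPI with SLURM PMI support; the pmi.h location varies.
./configure --with-slurm --with-pmi=/opt/slurm/include --prefix=$HOME/ompi
make all install

# Then launch directly with srun, no mpirun and no resv_ports needed:
srun -n 8 ./my_mpi_application
```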

HTH
Ralph

On Jan 11, 2012, at 6:04 AM, Jeff Squyres wrote:

> The latest -- 1.5.5rc2 (just released last night) -- has direct "srun 
> my_mpi_application" integration.  It's not in a final release yet, but as you 
> can probably guess by the version number, it'll be in the final version of 
> 1.5.5.
> 
> We have 1-2 bugs remaining in 1.5.5 that are actively being worked.  Once 
> those are fixed (hopefully, in the Very Near Future), 1.5.5 will be released.
> 
> 
> On Jan 10, 2012, at 11:38 PM, Andrew Senin wrote:
> 
>> Hi,
>> 
>> Could you please describe the current status of SLURM integration? I
>> had a feeling srun supports direct launch of Open MPI applications
>> (without mpirun) compiled with the 1.5 branch.  At least one of my
>> colleagues succeeded at that.
>> 
>> But when I installed SLURM and the head revision of OpenMPI 1.5 branch
>> I did not manage to run it without setting the SLURM_STEP_RESV_PORTS
>> environment variable. I receive the following:
>> 
>> orte_grpcomm_modex failed
>> --> Returned "A message is attempting to be sent to a process whose
>> contact information is unknown" (-117) instead of "Success" (0)
>> --
>> [mir9:25477] *** An error occurred in MPI_Init
>> [mir9:25477] *** on a NULL communicator
>> [mir9:25477] *** Unknown error
>> [mir9:25477] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
>> 
>> So I have 2 questions:
>> 1. Is support of SLURM in the head revision of 1.5 branch stable
>> enough to use it in the lab?
>> 2. Does direct launch of mpi applications require setting the
>> SLURM_STEP_RESV_PORTS environment variable?
>> 
>> Thanks,
>> Andrew Senin.
>> 
> 
> 
> -- 
> Jeff Squyres
> jsquy...@cisco.com
> For corporate legal information go to:
> http://www.cisco.com/web/about/doing_business/legal/cri/
> 
> 




Re: [OMPI users] using MPI_Recv in two different threads.

2012-01-11 Thread TERRY DONTJE
I am a little confused by your problem statement.  Are you saying you 
want each MPI process to have multiple threads that can call MPI 
concurrently?  If so, you'll want to read up on the MPI_Init_thread 
function.
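A minimal C sketch of that pattern (the peer rank, tag, and message layout 
here are all hypothetical; this is only safe if the library actually grants 
MPI_THREAD_MULTIPLE):

```c
#include <mpi.h>
#include <pthread.h>
#include <stdio.h>

/* One receiver thread per peer, each blocking in MPI_Recv on its own
 * source rank.  Only safe when MPI_Init_thread grants MPI_THREAD_MULTIPLE. */
static void *recv_from_peer(void *arg)
{
    int peer = *(int *)arg;
    int msg;
    /* This thread receives only from 'peer', so no other thread's
     * message can be picked up here. */
    MPI_Recv(&msg, 1, MPI_INT, peer, 0, MPI_COMM_WORLD, MPI_STATUS_IGNORE);
    printf("received %d from rank %d\n", msg, peer);
    return NULL;
}

int main(int argc, char **argv)
{
    int provided;
    MPI_Init_thread(&argc, &argv, MPI_THREAD_MULTIPLE, &provided);
    if (provided < MPI_THREAD_MULTIPLE) {
        fprintf(stderr, "MPI_THREAD_MULTIPLE not provided\n");
        MPI_Abort(MPI_COMM_WORLD, 1);
    }
    /* Sketch: spawn one recv_from_peer thread per peer rank with
     * pthread_create(), have the peers send, then pthread_join() them. */
    MPI_Finalize();
    return 0;
}
```

Since the question was about mpi4py: if I recall correctly, mpi4py requests a 
thread level at import time and MPI.Query_thread() reports what was actually 
granted, so the same check applies there.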


--td

On 1/11/2012 7:19 AM, Hamilton Fischer wrote:

Hi, I'm actually using mpi4py but my question should be similar to normal MPI 
in spirit.

Simply, I want to do an MPMD application with a dedicated thread for each node 
(I have a small network). I was wondering if it was okay to do a blocking recv 
in each independent thread. Of course, since each thread has one node, there is 
no problem with wrong recv's being picked up by other threads.


Thanks.

noobermin



--
Terry D. Dontje | Principal Software Engineer
Developer Tools Engineering | +1.781.442.2631
Oracle *- Performance Technologies*
95 Network Drive, Burlington, MA 01803
Email terry.don...@oracle.com 





Re: [OMPI users] Status of SLURM integration

2012-01-11 Thread Jeff Squyres
The latest -- 1.5.5rc2 (just released last night) -- has direct "srun 
my_mpi_application" integration.  It's not in a final release yet, but as you 
can probably guess by the version number, it'll be in the final version of 
1.5.5.

We have 1-2 bugs remaining in 1.5.5 that are actively being worked.  Once those 
are fixed (hopefully, in the Very Near Future), 1.5.5 will be released.


On Jan 10, 2012, at 11:38 PM, Andrew Senin wrote:

> Hi,
> 
> Could you please describe the current status of SLURM integration? I
> had a feeling srun supports direct launch of Open MPI applications
> (without mpirun) compiled with the 1.5 branch.  At least one of my
> colleagues succeeded at that.
> 
> But when I installed SLURM and the head revision of OpenMPI 1.5 branch
> I did not manage to run it without setting the SLURM_STEP_RESV_PORTS
> environment variable. I receive the following:
> 
>  orte_grpcomm_modex failed
>  --> Returned "A message is attempting to be sent to a process whose
> contact information is unknown" (-117) instead of "Success" (0)
> --
> [mir9:25477] *** An error occurred in MPI_Init
> [mir9:25477] *** on a NULL communicator
> [mir9:25477] *** Unknown error
> [mir9:25477] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort
> 
> So I have 2 questions:
> 1. Is support of SLURM in the head revision of 1.5 branch stable
> enough to use it in the lab?
> 2. Does direct launch of mpi applications require setting the
> SLURM_STEP_RESV_PORTS environment variable?
> 
> Thanks,
> Andrew Senin.
> 


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/




[OMPI users] using MPI_Recv in two different threads.

2012-01-11 Thread Hamilton Fischer
Hi, I'm actually using mpi4py but my question should be similar to normal MPI 
in spirit.

Simply, I want to do an MPMD application with a dedicated thread for each node 
(I have a small network). I was wondering if it was okay to do a blocking recv 
in each independent thread. Of course, since each thread has one node, there is 
no problem with wrong recv's being picked up by other threads.


Thanks.

noobermin



Re: [OMPI users] Passwordless ssh

2012-01-11 Thread Reuti
Hi,

On 11.01.2012 at 05:46, Ralph Castain wrote:

> You might want to ask that on the Beowulf mailing lists - I suspect it has 
> something to do with the mount procedure, but honestly have no real idea how 
> to resolve it.
> 
> On Jan 10, 2012, at 8:45 PM, Shaandar Nyamtulga wrote:
> 
>> Hi
>> I built a Beowulf cluster using Open MPI, following this link:
>> http://techtinkering.com/2009/12/02/setting-up-a-beowulf-cluster-using-open-mpi-on-linux/
>> I can do ssh to my slave nodes without the slave mpiuser's password before 
>> mounting my slaves.
>> But when I mount my slaves and do ssh, the slaves ask again their passwords.
>> The master's and slaves' .ssh directories and authorized_keys files have
>> permissions 700 and 600 respectively, and are owned only by mpiuser (via
>> chown). The RSA key has no passphrase.

It sounds like the ~/.ssh/authorized_keys on the master doesn't contain the 
master's own public key (on a plain server you don't need it). Hence when you 
mount that home directory on the slaves, the key is missing there too.
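That is, something like this on the master (assuming the default id_rsa key):

```shell
# Authorize the master's own key so the mounted copy on the
# slaves contains it as well.
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```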

-- Reuti


>> Please help me on this matter.
>> 
> 




[OMPI users] Status of SLURM integration

2012-01-11 Thread Andrew Senin
Hi,

Could you please describe the current status of SLURM integration? I
had a feeling srun supports direct launch of Open MPI applications
(without mpirun) compiled with the 1.5 branch.  At least one of my
colleagues succeeded at that.

But when I installed SLURM and the head revision of OpenMPI 1.5 branch
I did not manage to run it without setting the SLURM_STEP_RESV_PORTS
environment variable. I receive the following:

  orte_grpcomm_modex failed
  --> Returned "A message is attempting to be sent to a process whose
contact information is unknown" (-117) instead of "Success" (0)
--
[mir9:25477] *** An error occurred in MPI_Init
[mir9:25477] *** on a NULL communicator
[mir9:25477] *** Unknown error
[mir9:25477] *** MPI_ERRORS_ARE_FATAL: your MPI job will now abort

So I have 2 questions:
1. Is support of SLURM in the head revision of 1.5 branch stable
enough to use it in the lab?
2. Does direct launch of mpi applications require setting the
SLURM_STEP_RESV_PORTS environment variable?

Thanks,
Andrew Senin.