Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain
I did one "chapter" of it on Jeff's blog and probably should complete it. 
Definitely need to update the FAQ for the new options.

Sadly, outside of that and the mpirun man page, there isn't much available yet. 
I'm woefully far behind on it.


On Feb 26, 2014, at 4:47 PM, Saliya Ekanayake  wrote:

> Thank you Ralph, this is very insightful and I think I can better understand 
> performance of our application. 
> 
> If I may ask, is there a document describing this affinity options? I've been 
> looking at tuning FAQ and Jeff's blog posts.
> 
> Thank you,
> Saliya
> 
> 
> On Wed, Feb 26, 2014 at 7:34 PM, Ralph Castain  wrote:
> 
> On Feb 26, 2014, at 4:29 PM, Saliya Ekanayake  wrote:
> 
>> I see, so if I understand correctly, the best scenario for threads would be 
>> to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads in each 
>> proc. 
> 
> Yes, that would be the best solution. If you have 4 cores in each socket, 
> then just bind each proc to the socket:
> 
> --map-by socket --bind-to socket
> 
> If you want to put one proc on each socket by itself, then do
> 
> --map-by ppr:1:socket --bind-to socket
> 
> 
>> 
>> Also, as you've mentioned binding threads to get memory locality, I guess 
>> this has to be done at application level and not an option in OMPI
> 
> Sadly yes - the problem is that MPI lacks an init call for each thread, and 
> so we don't see the threads being started. You can use hwloc to bind each 
> thread, but it has to be done in the app itself.
> 
>> 
>> Thank you,
>> Saliya
>> 
>> 
>> On Wed, Feb 26, 2014 at 4:50 PM, Ralph Castain  wrote:
>> Sorry, had to run some errands.
>> 
>> On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake  wrote:
>> 
>>> Is it possible to bind to cores of multiple sockets? Say I have a machine 
>>> with 2 sockets each with 4 cores and if I run 8 threads with 1 proc can I 
>>> utilize all 8 cores for 8 threads?
>> 
>> In that scenario, you won't get any benefit from binding as we only bind at 
>> the proc level (and binding to the entire node does nothing). You might want 
>> to bind your threads, however, as otherwise the threads will not necessarily 
>> execute local to any memory they malloc.
>> 
>>> 
>>> Thank you for speedy replies
>>> 
>>> Saliya
>>> 
>>> 
>>> On Wed, Feb 26, 2014 at 3:21 PM, Ralph Castain  wrote:
>>> 
>>> On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake  wrote:
>>> 
 I have a followup question on this. In our application we have parallel 
 for loops similar to OMP parallel for. I noticed that in order to gain 
 speedup with threads I've to set --bind-to none, otherwise multiple 
 threads will bind to same core giving no increase in performance. For 
 example, I get following (attached) performance for a simple 3point 
 stencil computation run with T threads on 1 MPI process on 1 node (Tx1x1). 
 
 My understanding is even when there are multiple procs per node we should 
 use --bind-to none in order to get performance with threads. Is this 
 correct? Also, what's the disadvantage of not using --bind-to core?
>>> 
>>> Your best performance with threads comes when you bind each process to 
>>> multiple cores. Binding helps performance by ensuring your memory is always 
>>> local, and provides some optimized scheduling benefits. You can bind to 
>>> multiple cores by adding the qualifier "pe=N" to your mapping definition, 
>>> like this:
>>> 
>>> mpirun --map-by socket:pe=4 
>>> 
>>> The above example will map processes by socket, and bind each process to 4 
>>> cores.
>>> 
>>> HTH
>>> Ralph
>>> 
 
 Thank you,
 Saliya
 
 
 On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake  
 wrote:
 Thank you Ralph, I'll check this.
 
 
 On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain  wrote:
 It means that OMPI didn't get built against libnuma, and so we can't 
 ensure that memory is being bound local to the proc binding. Check to see 
 if numactl and numactl-devel are installed, or you can turn off the 
 warning using "-mca hwloc_base_mem_bind_failure_action silent"
 
 
 On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake  wrote:
 
> Hi,
> 
> I tried to run an MPI Java program with --bind-to core. I receive the 
> following warning and wonder how to fix this.
> 
> 
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
> 
>   Node:  192.168.0.19
> 
> This is a warning only; your job will continue, though performance may
> be degraded.
> 
> 
> Thank you,
> Saliya
> 
> -- 
> Saliya Ekanayake esal...@gmail.com 
> Cell 812-391-4914 Home 

Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Thank you Ralph, this is very insightful and I think I can better
understand the performance of our application.

If I may ask, is there a document describing these affinity options? I've
been looking at the tuning FAQ and Jeff's blog posts.

Thank you,
Saliya


On Wed, Feb 26, 2014 at 7:34 PM, Ralph Castain  wrote:

>
> On Feb 26, 2014, at 4:29 PM, Saliya Ekanayake  wrote:
>
> I see, so if I understand correctly, the best scenario for threads would
> be to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads in
> each proc.
>
>
> Yes, that would be the best solution. If you have 4 cores in each socket,
> then just bind each proc to the socket:
>
> --map-by socket --bind-to socket
>
> If you want to put one proc on each socket by itself, then do
>
> --map-by ppr:1:socket --bind-to socket
>
>
>
> Also, as you've mentioned binding threads to get memory locality, I guess
> this has to be done at application level and not an option in OMPI
>
>
> Sadly yes - the problem is that MPI lacks an init call for each thread,
> and so we don't see the threads being started. You can use hwloc to bind
> each thread, but it has to be done in the app itself.
>
>
> Thank you,
> Saliya
>
>
> On Wed, Feb 26, 2014 at 4:50 PM, Ralph Castain  wrote:
>
>> Sorry, had to run some errands.
>>
>> On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake  wrote:
>>
>> Is it possible to bind to cores of multiple sockets? Say I have a machine
>> with 2 sockets each with 4 cores and if I run 8 threads with 1 proc can I
>> utilize all 8 cores for 8 threads?
>>
>>
>> In that scenario, you won't get any benefit from binding as we only bind
>> at the proc level (and binding to the entire node does nothing). You might
>> want to bind your threads, however, as otherwise the threads will not
>> necessarily execute local to any memory they malloc.
>>
>>
>> Thank you for speedy replies
>>
>> Saliya
>>
>>
>> On Wed, Feb 26, 2014 at 3:21 PM, Ralph Castain  wrote:
>>
>>>
>>> On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake 
>>> wrote:
>>>
>>> I have a followup question on this. In our application we have parallel
>>> for loops similar to OMP parallel for. I noticed that in order to gain
>>> speedup with threads I've to set --bind-to none, otherwise multiple threads
>>> will bind to same core giving no increase in performance. For example, I
>>> get following (attached) performance for a simple 3point stencil
>>> computation run with T threads on 1 MPI process on 1 node (Tx1x1).
>>>
>>> My understanding is even when there are multiple procs per node we
>>> should use --bind-to none in order to get performance with threads. Is this
>>> correct? Also, what's the disadvantage of not using --bind-to core?
>>>
>>>
>>> Your best performance with threads comes when you bind each process to
>>> multiple cores. Binding helps performance by ensuring your memory is always
>>> local, and provides some optimized scheduling benefits. You can bind to
>>> multiple cores by adding the qualifier "pe=N" to your mapping definition,
>>> like this:
>>>
>>> mpirun --map-by socket:pe=4 
>>>
>>> The above example will map processes by socket, and bind each process to
>>> 4 cores.
>>>
>>> HTH
>>> Ralph
>>>
>>>
>>> Thank you,
>>> Saliya
>>>
>>>
>>> On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake wrote:
>>>
 Thank you Ralph, I'll check this.


 On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain wrote:

> It means that OMPI didn't get built against libnuma, and so we can't
> ensure that memory is being bound local to the proc binding. Check to see
> if numactl and numactl-devel are installed, or you can turn off the 
> warning
> using "-mca hwloc_base_mem_bind_failure_action silent"
>
>
> On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake 
> wrote:
>
> Hi,
>
> I tried to run an MPI Java program with --bind-to core. I receive the
> following warning and wonder how to fix this.
>
>
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>
>   Node:  192.168.0.19
>
> This is a warning only; your job will continue, though performance may
> be degraded.
>
>
> Thank you,
> Saliya
>
> --
> Saliya Ekanayake esal...@gmail.com
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
>  ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



 --
 Saliya 

Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain

On Feb 26, 2014, at 4:29 PM, Saliya Ekanayake  wrote:

> I see, so if I understand correctly, the best scenario for threads would be 
> to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads in each 
> proc. 

Yes, that would be the best solution. If you have 4 cores in each socket, then 
just bind each proc to the socket:

--map-by socket --bind-to socket

If you want to put one proc on each socket by itself, then do

--map-by ppr:1:socket --bind-to socket
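
For the two-procs-per-node, four-threads-each case you describe, a complete
command line might look roughly like this (a sketch only: ./your_app and the
OMP_NUM_THREADS setting are placeholders for whatever your runtime uses, and
--report-bindings simply prints where each rank was bound so you can verify it):

mpirun -np 2 --map-by socket:pe=4 --report-bindings -x OMP_NUM_THREADS=4 ./your_app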


> 
> Also, as you've mentioned binding threads to get memory locality, I guess 
> this has to be done at application level and not an option in OMPI

Sadly yes - the problem is that MPI lacks an init call for each thread, and so 
we don't see the threads being started. You can use hwloc to bind each thread, 
but it has to be done in the app itself.
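
For illustration, binding the calling thread with hwloc looks roughly like the
following minimal sketch (assumes the hwloc headers and library are available;
error handling omitted):

#include <hwloc.h>

/* Bind the calling thread to the idx-th core of the machine (sketch only). */
static void bind_this_thread_to_core(hwloc_topology_t topo, int idx)
{
    hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, idx);
    if (core != NULL) {
        /* HWLOC_CPUBIND_THREAD binds only the calling thread, not the process. */
        hwloc_set_cpubind(topo, core->cpuset, HWLOC_CPUBIND_THREAD);
    }
}

The topology would be created once (hwloc_topology_init + hwloc_topology_load),
each thread would call the function above with its own index, and the topology
would be destroyed with hwloc_topology_destroy at shutdown.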

> 
> Thank you,
> Saliya
> 
> 
> On Wed, Feb 26, 2014 at 4:50 PM, Ralph Castain  wrote:
> Sorry, had to run some errands.
> 
> On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake  wrote:
> 
>> Is it possible to bind to cores of multiple sockets? Say I have a machine 
>> with 2 sockets each with 4 cores and if I run 8 threads with 1 proc can I 
>> utilize all 8 cores for 8 threads?
> 
> In that scenario, you won't get any benefit from binding as we only bind at 
> the proc level (and binding to the entire node does nothing). You might want 
> to bind your threads, however, as otherwise the threads will not necessarily 
> execute local to any memory they malloc.
> 
>> 
>> Thank you for speedy replies
>> 
>> Saliya
>> 
>> 
>> On Wed, Feb 26, 2014 at 3:21 PM, Ralph Castain  wrote:
>> 
>> On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake  wrote:
>> 
>>> I have a followup question on this. In our application we have parallel for 
>>> loops similar to OMP parallel for. I noticed that in order to gain speedup 
>>> with threads I've to set --bind-to none, otherwise multiple threads will 
>>> bind to same core giving no increase in performance. For example, I get 
>>> following (attached) performance for a simple 3point stencil computation 
>>> run with T threads on 1 MPI process on 1 node (Tx1x1). 
>>> 
>>> My understanding is even when there are multiple procs per node we should 
>>> use --bind-to none in order to get performance with threads. Is this 
>>> correct? Also, what's the disadvantage of not using --bind-to core?
>> 
>> Your best performance with threads comes when you bind each process to 
>> multiple cores. Binding helps performance by ensuring your memory is always 
>> local, and provides some optimized scheduling benefits. You can bind to 
>> multiple cores by adding the qualifier "pe=N" to your mapping definition, 
>> like this:
>> 
>> mpirun --map-by socket:pe=4 
>> 
>> The above example will map processes by socket, and bind each process to 4 
>> cores.
>> 
>> HTH
>> Ralph
>> 
>>> 
>>> Thank you,
>>> Saliya
>>> 
>>> 
>>> On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake  
>>> wrote:
>>> Thank you Ralph, I'll check this.
>>> 
>>> 
>>> On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain  wrote:
>>> It means that OMPI didn't get built against libnuma, and so we can't ensure 
>>> that memory is being bound local to the proc binding. Check to see if 
>>> numactl and numactl-devel are installed, or you can turn off the warning 
>>> using "-mca hwloc_base_mem_bind_failure_action silent"
>>> 
>>> 
>>> On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake  wrote:
>>> 
 Hi,
 
 I tried to run an MPI Java program with --bind-to core. I receive the 
 following warning and wonder how to fix this.
 
 
 WARNING: a request was made to bind a process. While the system
 supports binding the process itself, at least one node does NOT
 support binding memory to the process location.
 
   Node:  192.168.0.19
 
 This is a warning only; your job will continue, though performance may
 be degraded.
 
 
 Thank you,
 Saliya
 
 -- 
 Saliya Ekanayake esal...@gmail.com 
 Cell 812-391-4914 Home 812-961-6383
 http://saliya.org
 ___
 users mailing list
 us...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> 
>>> 
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>> 
>>> 
>>> 
>>> -- 
>>> Saliya Ekanayake esal...@gmail.com 
>>> Cell 812-391-4914 Home 812-961-6383
>>> http://saliya.org
>>> 
>>> 
>>> 
>>> -- 
>>> Saliya Ekanayake esal...@gmail.com 
>>> Cell 812-391-4914 Home 812-961-6383
>>> http://saliya.org
>>> <3pointstencil.png>___
>>> 
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> 

Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
I see, so if I understand correctly, the best scenario for threads would be
to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads in
each proc.

Also, as you've mentioned binding threads to get memory locality, I guess
this has to be done at the application level and is not an option in OMPI

Thank you,
Saliya


On Wed, Feb 26, 2014 at 4:50 PM, Ralph Castain  wrote:

> Sorry, had to run some errands.
>
> On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake  wrote:
>
> Is it possible to bind to cores of multiple sockets? Say I have a machine
> with 2 sockets each with 4 cores and if I run 8 threads with 1 proc can I
> utilize all 8 cores for 8 threads?
>
>
> In that scenario, you won't get any benefit from binding as we only bind
> at the proc level (and binding to the entire node does nothing). You might
> want to bind your threads, however, as otherwise the threads will not
> necessarily execute local to any memory they malloc.
>
>
> Thank you for speedy replies
>
> Saliya
>
>
> On Wed, Feb 26, 2014 at 3:21 PM, Ralph Castain  wrote:
>
>>
>> On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake  wrote:
>>
>> I have a followup question on this. In our application we have parallel
>> for loops similar to OMP parallel for. I noticed that in order to gain
>> speedup with threads I've to set --bind-to none, otherwise multiple threads
>> will bind to same core giving no increase in performance. For example, I
>> get following (attached) performance for a simple 3point stencil
>> computation run with T threads on 1 MPI process on 1 node (Tx1x1).
>>
>> My understanding is even when there are multiple procs per node we should
>> use --bind-to none in order to get performance with threads. Is this
>> correct? Also, what's the disadvantage of not using --bind-to core?
>>
>>
>> Your best performance with threads comes when you bind each process to
>> multiple cores. Binding helps performance by ensuring your memory is always
>> local, and provides some optimized scheduling benefits. You can bind to
>> multiple cores by adding the qualifier "pe=N" to your mapping definition,
>> like this:
>>
>> mpirun --map-by socket:pe=4 
>>
>> The above example will map processes by socket, and bind each process to
>> 4 cores.
>>
>> HTH
>> Ralph
>>
>>
>> Thank you,
>> Saliya
>>
>>
>> On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake wrote:
>>
>>> Thank you Ralph, I'll check this.
>>>
>>>
>>> On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain wrote:
>>>
 It means that OMPI didn't get built against libnuma, and so we can't
 ensure that memory is being bound local to the proc binding. Check to see
 if numactl and numactl-devel are installed, or you can turn off the warning
 using "-mca hwloc_base_mem_bind_failure_action silent"


 On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake 
 wrote:

 Hi,

 I tried to run an MPI Java program with --bind-to core. I receive the
 following warning and wonder how to fix this.


 WARNING: a request was made to bind a process. While the system
 supports binding the process itself, at least one node does NOT
 support binding memory to the process location.

   Node:  192.168.0.19

 This is a warning only; your job will continue, though performance may
 be degraded.


 Thank you,
 Saliya

 --
 Saliya Ekanayake esal...@gmail.com
 Cell 812-391-4914 Home 812-961-6383
 http://saliya.org
  ___
 users mailing list
 us...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/users



 ___
 users mailing list
 us...@open-mpi.org
 http://www.open-mpi.org/mailman/listinfo.cgi/users

>>>
>>>
>>>
>>> --
>>> Saliya Ekanayake esal...@gmail.com
>>> Cell 812-391-4914 Home 812-961-6383
>>> http://saliya.org
>>>
>>
>>
>>
>> --
>> Saliya Ekanayake esal...@gmail.com
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>>  <3pointstencil.png>___
>>
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> Saliya Ekanayake esal...@gmail.com
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
>  ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383

Re: [OMPI users] OrangeFS ROMIO support

2014-02-26 Thread Edgar Gabriel
That was my fault, I did not follow up at the time; I probably got
sidetracked by something. Anyway, I suspect that you actually have the
patch, otherwise the current Open MPI trunk and the 1.7 release series
would not have it after the last ROMIO update - at least I did not
reapply it, and I am not sure whether Nathan did.

Thanks
Edgar

On 2/26/2014 4:52 PM, Latham, Robert J. wrote:
> On Tue, 2014-02-25 at 07:26 -0600, Edgar Gabriel wrote:
>> this was/is a bug in ROMIO, in which they assume a datatype is an int. I
>> fixed it originally in a previous version of Open MPI on the trunk, but
>> it did not get ported upstream, so we might have to do the same fix again.
>>
> 
> Sorry about that.  I'm going through OpenMPI SVN now to see what other
> gems I may have dropped.
> 
> ==rob
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 

-- 
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  Fax: +1 (713) 743-3335





Re: [OMPI users] OrangeFS ROMIO support

2014-02-26 Thread Latham, Robert J.
On Tue, 2014-02-25 at 07:26 -0600, Edgar Gabriel wrote:
> this was/is a bug in ROMIO, in which they assume a datatype is an int. I
> fixed it originally in a previous version of Open MPI on the trunk, but
> it did not get ported upstream, so we might have to do the same fix again.
> 

Sorry about that.  I'm going through OpenMPI SVN now to see what other
gems I may have dropped.

==rob


Re: [OMPI users] Connection timed out with multiple nodes

2014-02-26 Thread Doug Roberts


o Before anyone spends time on my previous post, I should report that there
has been an important development in this problem.  We got the original
test program to run without hanging by directly connecting the two test
compute nodes together (thus bypassing the switch), as shown here, where
eth2 is still the 10G interface, i.e.:

[roberpj@bro127:~/samples/openmpi/mpi_test] 
/opt/sharcnet/openmpi/1.6.5/intel-debug/bin/mpirun -np 2 --mca btl 
tcp,sm,self --mca btl_tcp_if_include eth2 --host bro127,bro128 ./a.out

Number of processes = 2
Test repeated 3 times for reliability
I am process 0 on node bro127
Run 1 of 3
P0: Sending to P1
P0: Waiting to receive from P1
P0: Received from to P1
Run 2 of 3
P0: Sending to P1
P0: Waiting to receive from P1
P0: Received from to P1
Run 3 of 3
P0: Sending to P1
P0: Waiting to receive from P1
P0: Received from to P1
P0: Done
I am process 1 on node bro128
P1: Waiting to receive from to P0
P1: Sending to to P0
P1: Waiting to receive from to P0
P1: Sending to to P0
P1: Waiting to receive from to P0
P1: Sending to to P0
P1: Done

o This now points to the Netgear XSM7224S 10G switch.  The firmware
version turns out to be slightly old at 9.0.1.14, so we will update
it to the latest 9.0.1.29 and then run the test again. I will report
back the result. In the meantime, if anyone knows of configuration
setting(s) in the switch that could block Open MPI message passing,
then please reply to this comment.  Thanks!
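
In case it helps anyone reproduce this, adding BTL verbosity to the same
command should show where the connection setup stalls when going through the
switch (a sketch based on the command above; btl_base_verbose is the generic
BTL verbosity parameter):

/opt/sharcnet/openmpi/1.6.5/intel-debug/bin/mpirun -np 2 --mca btl tcp,sm,self \
  --mca btl_tcp_if_include eth2 --mca btl_base_verbose 30 --host bro127,bro128 ./a.out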


-- Forwarded message --
List-Post: users@lists.open-mpi.org
Date: Tue, 25 Feb 2014 20:07:31 -0500 (EST)
From: Doug Roberts 
To: us...@open-mpi.org
Subject: Re: [OMPI users] Connection timed out with multiple nodes

Hello again. The "oob_stress" program runs cleanly on each of
the two test nodes bro127 and bro128, as shown below.  Would
you say this rules out a problem with the network and switch,
or are there other test programs that should be run next?

o eth0 and eth2: without plm_base_verbose

[roberpj@bro127:~/samples/openmpi/oob_stress] mpirun -npernode 1 -mca 
oob_tcp_if_include eth0 ./oob_stress

[bro127:02020] Ring 1 message size 10 bytes
[bro127:02020] [[27318,1],0] Ring 1 completed
[bro127:02020] Ring 2 message size 100 bytes
[bro127:02020] [[27318,1],0] Ring 2 completed
[bro127:02020] Ring 3 message size 1000 bytes
[bro127:02020] [[27318,1],0] Ring 3 completed
[roberpj@bro127:~/samples/openmpi/oob_stress] mpirun -npernode 1 -mca 
oob_tcp_if_include eth2 ./oob_stress

[bro127:02022] Ring 1 message size 10 bytes
[bro127:02022] [[27312,1],0] Ring 1 completed
[bro127:02022] Ring 2 message size 100 bytes
[bro127:02022] [[27312,1],0] Ring 2 completed
[bro127:02022] Ring 3 message size 1000 bytes
[bro127:02022] [[27312,1],0] Ring 3 completed

[roberpj@bro128:~/samples/openmpi/oob_stress] mpirun -npernode 1 -mca 
oob_tcp_if_include eth0 ./oob_stress

[bro128:04484] Ring 1 message size 10 bytes
[bro128:04484] [[23046,1],0] Ring 1 completed
[bro128:04484] Ring 2 message size 100 bytes
[bro128:04484] [[23046,1],0] Ring 2 completed
[bro128:04484] Ring 3 message size 1000 bytes
[bro128:04484] [[23046,1],0] Ring 3 completed
[roberpj@bro128:~/samples/openmpi/oob_stress] mpirun -npernode 1 -mca 
oob_tcp_if_include eth2 ./oob_stress

[bro128:04486] Ring 1 message size 10 bytes
[bro128:04486] [[23040,1],0] Ring 1 completed
[bro128:04486] Ring 2 message size 100 bytes
[bro128:04486] [[23040,1],0] Ring 2 completed
[bro128:04486] Ring 3 message size 1000 bytes
[bro128:04486] [[23040,1],0] Ring 3 completed

o eth2: with plm_base_verbose on

[roberpj@bro127:~/samples/openmpi/oob_stress] mpirun -npernode 1 -mca 
oob_tcp_if_include eth2 -mca plm_base_verbose 5 ./oob_stress

[bro127:01936] mca:base:select:(  plm) Querying component [rsh]
[bro127:01936] [[INVALID],INVALID] plm:base:rsh_lookup on agent ssh : rsh path 
NULL
[bro127:01936] mca:base:select:(  plm) Query of component [rsh] set priority to 
10

[bro127:01936] mca:base:select:(  plm) Querying component [slurm]
[bro127:01936] mca:base:select:(  plm) Skipping component [slurm]. Query failed 
to return a module

[bro127:01936] mca:base:select:(  plm) Querying component [tm]
[bro127:01936] mca:base:select:(  plm) Skipping component [tm]. Query failed to 
return a module

[bro127:01936] mca:base:select:(  plm) Selected component [rsh]
[bro127:01936] plm:base:set_hnp_name: initial bias 1936 nodename hash 
3261509427

[bro127:01936] plm:base:set_hnp_name: final jobfam 27333
[bro127:01936] [[27333,0],0] plm:base:rsh_setup on agent ssh : rsh path NULL
[bro127:01936] [[27333,0],0] plm:base:receive start comm
[bro127:01936] released to spawn
[bro127:01936] [[27333,0],0] plm:base:setup_job for job [INVALID]
[bro127:01936] [[27333,0],0] plm:rsh: launching job [27333,1]
[bro127:01936] [[27333,0],0] plm:rsh: no new daemons to launch
[bro127:01936] [[27333,0],0] plm:base:launch_apps for job [27333,1]
[bro127:01936] [[27333,0],0] plm:base:report_launched for job [27333,1]
[bro127:01936] [[27333,0],0] 

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread Latham, Robert J.
On Wed, 2014-02-26 at 15:27 -0600, Edgar Gabriel wrote:
> ok, then this must be a difference between OrangeFS and PVFs2.  It turns
> out that trunk and 1.7 does actually have the patch, but 1.6 series does
> not have it. The actual commit was done in
> 
> https://svn.open-mpi.org/trac/ompi/changeset/24768

Oh man, it's been nearly three years and I haven't picked up that patch.
In my defense, I did not know there were any OMPI changes I needed to
pick up into ROMIO.

==rob


Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain
Sorry, had to run some errands.

On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake  wrote:

> Is it possible to bind to cores of multiple sockets? Say I have a machine 
> with 2 sockets each with 4 cores and if I run 8 threads with 1 proc can I 
> utilize all 8 cores for 8 threads?

In that scenario, you won't get any benefit from binding as we only bind at the 
proc level (and binding to the entire node does nothing). You might want to 
bind your threads, however, as otherwise the threads will not necessarily 
execute local to any memory they malloc.
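
One Linux-only way to do that from inside the application is the glibc
extension pthread_setaffinity_np (a sketch; assumes _GNU_SOURCE, and how core
IDs map to sockets is OS- and BIOS-dependent):

#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>

/* Pin the calling thread to one OS core; returns 0 on success (sketch only). */
static int pin_self_to_core(int core_id)
{
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(core_id, &set);
    return pthread_setaffinity_np(pthread_self(), sizeof(set), &set);
}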

> 
> Thank you for speedy replies
> 
> Saliya
> 
> 
> On Wed, Feb 26, 2014 at 3:21 PM, Ralph Castain  wrote:
> 
> On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake  wrote:
> 
>> I have a followup question on this. In our application we have parallel for 
>> loops similar to OMP parallel for. I noticed that in order to gain speedup 
>> with threads I've to set --bind-to none, otherwise multiple threads will 
>> bind to same core giving no increase in performance. For example, I get 
>> following (attached) performance for a simple 3point stencil computation run 
>> with T threads on 1 MPI process on 1 node (Tx1x1). 
>> 
>> My understanding is even when there are multiple procs per node we should 
>> use --bind-to none in order to get performance with threads. Is this 
>> correct? Also, what's the disadvantage of not using --bind-to core?
> 
> Your best performance with threads comes when you bind each process to 
> multiple cores. Binding helps performance by ensuring your memory is always 
> local, and provides some optimized scheduling benefits. You can bind to 
> multiple cores by adding the qualifier "pe=N" to your mapping definition, 
> like this:
> 
> mpirun --map-by socket:pe=4 
> 
> The above example will map processes by socket, and bind each process to 4 
> cores.
> 
> HTH
> Ralph
> 
>> 
>> Thank you,
>> Saliya
>> 
>> 
>> On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake  wrote:
>> Thank you Ralph, I'll check this.
>> 
>> 
>> On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain  wrote:
>> It means that OMPI didn't get built against libnuma, and so we can't ensure 
>> that memory is being bound local to the proc binding. Check to see if 
>> numactl and numactl-devel are installed, or you can turn off the warning 
>> using "-mca hwloc_base_mem_bind_failure_action silent"
>> 
>> 
>> On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake  wrote:
>> 
>>> Hi,
>>> 
>>> I tried to run an MPI Java program with --bind-to core. I receive the 
>>> following warning and wonder how to fix this.
>>> 
>>> 
>>> WARNING: a request was made to bind a process. While the system
>>> supports binding the process itself, at least one node does NOT
>>> support binding memory to the process location.
>>> 
>>>   Node:  192.168.0.19
>>> 
>>> This is a warning only; your job will continue, though performance may
>>> be degraded.
>>> 
>>> 
>>> Thank you,
>>> Saliya
>>> 
>>> -- 
>>> Saliya Ekanayake esal...@gmail.com 
>>> Cell 812-391-4914 Home 812-961-6383
>>> http://saliya.org
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> 
>> 
>> -- 
>> Saliya Ekanayake esal...@gmail.com 
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>> 
>> 
>> 
>> -- 
>> Saliya Ekanayake esal...@gmail.com 
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>> <3pointstencil.png>___
>> 
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> -- 
> Saliya Ekanayake esal...@gmail.com 
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread Edgar Gabriel
OK, then this must be a difference between OrangeFS and PVFS2.  It turns
out that trunk and 1.7 do actually have the patch, but the 1.6 series does
not. The actual commit was done in

https://svn.open-mpi.org/trac/ompi/changeset/24768

and based on the line numbers, I think it should apply cleanly.
Using that patch, 1.6 compiles and executes without complaining about
PVFS2 support, but running my testsuite brings up a ton of data errors.
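
For anyone who wants to try the same thing against a 1.6 checkout, one way to
pull that changeset out as a patch is roughly the following (a sketch; it
assumes the public Subversion root is https://svn.open-mpi.org/svn/ompi, and
the -p level may need adjusting to match the paths in the diff):

svn diff -c 24768 https://svn.open-mpi.org/svn/ompi/trunk > r24768.patch
cd openmpi-1.6.5 && patch -p0 < r24768.patch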

I am still digging a bit into that, but that's where I am right now.

Thanks
Edgar




On 2/26/2014 3:17 PM, vithanousek wrote:
> At first Thank you very much for your time.
> 
> "--with-file-system=pvfs2+ufs+nfs" didnt help.
> 
> But if find (by google) some part of orangefs test. I dont know what is
> this exactly doing, but when I edited source code of OpenMPI like doing
> this line, all seems that it is working now. (changing
> ADIOI_PVFS2_IReadContig and ADIOI_PVFS2_IWriteContig to NULL in file
> ad_pvfs2.c)
> 
> http://www.orangefs.org/trac/orangefs/browser/branches/OFSTest-dev/OFSTest/OFSTestNode.py?rev=10645#L1328
> 
> Other tests I will do tomorrow.
> 
> Thanks
> Hanousek Vít
> 
> 
> 
> -- Original message --
> From: Edgar Gabriel 
> To: us...@open-mpi.org
> Date: 26. 2. 2014 21:08:07
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> 
> 
> not sure whether its the problem or not, but usually have an additional
> flag set :
> 
> --with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs
> --with-pvfs2=/opt/pvfs-2.8.2"
> 
> compilation is a bit slow for me today...
> 
> Edgar
> 
> 
> On 2/26/2014 2:05 PM, vithanousek wrote:
> > Now I compiled by doing this:
> > OrangeFS (original, withou editing):
> >
> > ./configure --prefix=/usr/local/orangefs
> > --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> > --with-openib=/usr --without-bmi-tcp --enable-shared
> > make
> > make kmod
> > make install
> > make kmod_install
> >
> > Without error.
> > OpenMPI (with edited switch to ifs):
> >
> > ./configure --prefix=/usr/local/openmpi_1.6.5_romio
> > --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs'
> > make
> > make install
> >
> > Without error.
> > parallel FS mount work. But I still cant use MPIIO.
> > I compiled simple MPIIO program and run it by this:
> >
> > mpicc -o mpiio mpiio.c
> > mpirun -np 1 -host node18 mpiio
> > [node18:02334] mca: base: component_find: unable to open
> > /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio:
> > /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so: undefined
> > symbol: ADIOI_PVFS2_IReadContig (ignored)
> >
> > And no file is created.
> > I tried compile it with:
> > mpicc -o mpiio mpiio.c -lpvfs2 -L/usr/local/orangefs/lib
> >
> > but i got the same results, have You any idea?
> >
> > Thank for reply
> > Hanousek Vít
> >
> >
> >
> >
> >
> > -- Original message --
> > From: vithanousek 
> > To: Open MPI Users 
> > Date: 26. 2. 2014 20:30:17
> > Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> >
> >
> > Thanks for your Time,
> >
> > I'm little bit confused, what is diferent between pvfs2 and
> > orangefs. I was thinking, that only project changes name.
> >
> > I get hint from OrangeFS maillist, to compile OrangeFs with
> > --enable-shared. This produce a some shared library (.so) in
> > /usr/local/orangefs/lib and I can compile OpenMPI 1.6.5 now (with
> > fixed "switch =>ifs" in ROMIO).
> >
> > I will test if it is working in next hour (some configuration steps
> > is needed).
> >
> > Thanks.
> > Hanousek Vít
> >
> > -- Original message --
> > From: Edgar Gabriel 
> > To: Open MPI Users 
> > Date: 26. 2. 2014 20:18:03
> > Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> >
> >
> > so we had ROMIO working with PVFS2 (not OrangeFS, which is however
> > registered as PVFS2 internally). We have one cluster which uses
> > OrangeFS, on that machine however we used OMPIO, not ROMIO. I am
> > currently compiling the 1.6 version of Open MPI to see whether I can
> > reproduce your problem.
> >
> > Thanks
> > Edgar
> >
> > On 2/26/2014 12:23 PM, vithanousek wrote:
> > > Thanks for reply,
> > >
> > > Is it possible that the patch solvs all this problems, not
> > only "switch
> > > => ifs" problem?
> > > I realy dont know, wher the problem is now (OpenMPI, ROMIO,
> > OrangeFS).
> > >
> > > Thanks
> > > Hanousek Vít
> > >
> > > -- Original message --
> > > From: Ralph Castain 
> > > To: Open MPI Users 
>

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread vithanousek
First of all, thank you very much for your time.

"--with-file-system=pvfs2+ufs+nfs" didn't help.

But I found (via Google) part of an OrangeFS test. I don't know exactly what
it does, but when I edited the OpenMPI source code the same way this line
does, everything seems to be working now. (Changing ADIOI_PVFS2_IReadContig
and ADIOI_PVFS2_IWriteContig to NULL in the file ad_pvfs2.c.)

http://www.orangefs.org/trac/orangefs/browser/branches/OFSTest-dev/OFSTest/OFSTestNode.py?rev=10645#L1328

I will do other tests tomorrow.

Thanks
Hanousek Vít 




-- Original message --
From: Edgar Gabriel 
To: us...@open-mpi.org
Date: 26. 2. 2014 21:08:07
Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

"not sure whether its the problem or not, but usually have an additional
flag set :

--with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs
--with-pvfs2=/opt/pvfs-2.8.2"

compilation is a bit slow for me today...

Edgar


On 2/26/2014 2:05 PM, vithanousek wrote:
> Now I compiled by doing this:
> OrangeFS (original, withou editing):
> 
> ./configure --prefix=/usr/local/orangefs
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> --with-openib=/usr --without-bmi-tcp --enable-shared
> make
> make kmod
> make install
> make kmod_install
> 
> Without error.
> OpenMPI (with edited switch to ifs):
> 
> ./configure --prefix=/usr/local/openmpi_1.6.5_romio
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs'
> make
> make install
> 
> Without error.
> parallel FS mount work. But I still cant use MPIIO.
> I compiled simple MPIIO program and run it by this:
> 
> mpicc -o mpiio mpiio.c
> mpirun -np 1 -host node18 mpiio
> [node18:02334] mca: base: component_find: unable to open
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio:
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so: undefined
> symbol: ADIOI_PVFS2_IReadContig (ignored)
> 
> And no file is created.
> I tried compile it with:
> mpicc -o mpiio mpiio.c -lpvfs2 -L/usr/local/orangefs/lib
> 
> but i got the same results, have You any idea?
> 
> Thank for reply
> Hanousek Vít
> 
> 
> 
> 
> 
> -- Original message --
> From: vithanousek 
> To: Open MPI Users 
> Date: 26. 2. 2014 20:30:17
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> 
> 
> Thanks for your Time,
> 
> I'm little bit confused, what is diferent between pvfs2 and
> orangefs. I was thinking, that only project changes name.
> 
> I get hint from OrangeFS maillist, to compile OrangeFs with
> --enable-shared. This produce a some shared library (.so) in
> /usr/local/orangefs/lib and I can compile OpenMPI 1.6.5 now (with
> fixed "switch =>ifs" in ROMIO).
> 
> I will test if it is working in next hour (some configuration steps
> is needed).
> 
> Thanks.
> Hanousek Vít
> 
> -- Original message --
> From: Edgar Gabriel 
> To: Open MPI Users 
> Date: 26. 2. 2014 20:18:03
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> 
> 
> so we had ROMIO working with PVFS2 (not OrangeFS, which is however
> registered as PVFS2 internally). We have one cluster which uses
> OrangeFS, on that machine however we used OMPIO, not ROMIO. I am
> currently compiling the 1.6 version of Open MPI to see whether I can
> reproduce your problem.
> 
> Thanks
> Edgar
> 
> On 2/26/2014 12:23 PM, vithanousek wrote:
> > Thanks for reply,
> >
> > Is it possible that the patch solvs all this problems, not
> only "switch
> > => ifs" problem?
> > I realy dont know, wher the problem is now (OpenMPI, ROMIO,
> OrangeFS).
> >
> > Thanks
> > Hanousek Vít
> >
> > -- Original message --
> > From: Ralph Castain 
> > To: Open MPI Users 
> > Date: 26. 2. 2014 19:16:36
> > Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> >
> >
> > Edgar hasn't had a chance to find the necessary patch - he was on
> > travel, returning soon.
> >
> >
> > On Feb 26, 2014, at 9:27 AM, vithanousek
>  wrote:
> >
> > > Hello,
> > >
> > > I have still problems with compiling OpenMPI 1.6.5 with OrangeFS
> > 2.8.7 support.
> > >
> > > I compiled OrangeFS by this:
> > >
> > > ./configure --prefix=/usr/local/orangefs2
> > --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> > --with-openib=/usr --without-bmi-tcp
> > > make -j 32
> > > make -j 32 kmod
> > > make install
> > > make kmod_install
> > >
> > > this works.
> > > than I tried to compile OpenMPI (with fixed convert_named
> function
> > in ad_pvfs2_io_dtype.c) by this:
> > >
> > > ./configure --prefix=/usr/local/openmpi_1.6.5_romio2
> > --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> > > (...)
> > > make -j32
> > > (...)
> > > CCLD mca_io_romio.la
> > > /usr/bin/ld:
> /usr/local/orangefs2/lib/libpvfs2.a(errno-mapping.o):
> > relocation R_X86_64_32S against `PINT_errno_mapping' can not
> be used
> > when making a shared object; recompile with -fPIC
> > > 

Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Is it possible to bind to cores of multiple sockets? Say I have a machine
with 2 sockets, each with 4 cores; if I run 8 threads with 1 proc, can I
utilize all 8 cores for the 8 threads?

Thank you for the speedy replies

Saliya


On Wed, Feb 26, 2014 at 3:21 PM, Ralph Castain  wrote:

>
> On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake  wrote:
>
> I have a followup question on this. In our application we have parallel
> for loops similar to OMP parallel for. I noticed that in order to gain
> speedup with threads I've to set --bind-to none, otherwise multiple threads
> will bind to same core giving no increase in performance. For example, I
> get following (attached) performance for a simple 3point stencil
> computation run with T threads on 1 MPI process on 1 node (Tx1x1).
>
> My understanding is even when there are multiple procs per node we should
> use --bind-to none in order to get performance with threads. Is this
> correct? Also, what's the disadvantage of not using --bind-to core?
>
>
> Your best performance with threads comes when you bind each process to
> multiple cores. Binding helps performance by ensuring your memory is always
> local, and provides some optimized scheduling benefits. You can bind to
> multiple cores by adding the qualifier "pe=N" to your mapping definition,
> like this:
>
> mpirun --map-by socket:pe=4 
>
> The above example will map processes by socket, and bind each process to 4
> cores.
>
> HTH
> Ralph
>
>
> Thank you,
> Saliya
>
>
> On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake wrote:
>
>> Thank you Ralph, I'll check this.
>>
>>
>> On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain  wrote:
>>
>>> It means that OMPI didn't get built against libnuma, and so we can't
>>> ensure that memory is being bound local to the proc binding. Check to see
>>> if numactl and numactl-devel are installed, or you can turn off the warning
>>> using "-mca hwloc_base_mem_bind_failure_action silent"
>>>
>>>
>>> On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake 
>>> wrote:
>>>
>>> Hi,
>>>
>>> I tried to run an MPI Java program with --bind-to core. I receive the
>>> following warning and wonder how to fix this.
>>>
>>>
>>> WARNING: a request was made to bind a process. While the system
>>> supports binding the process itself, at least one node does NOT
>>> support binding memory to the process location.
>>>
>>>   Node:  192.168.0.19
>>>
>>> This is a warning only; your job will continue, though performance may
>>> be degraded.
>>>
>>>
>>> Thank you,
>>> Saliya
>>>
>>> --
>>> Saliya Ekanayake esal...@gmail.com
>>> Cell 812-391-4914 Home 812-961-6383
>>> http://saliya.org
>>>  ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>>
>>
>> --
>> Saliya Ekanayake esal...@gmail.com
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>>
>
>
>
> --
> Saliya Ekanayake esal...@gmail.com
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
>  <3pointstencil.png>___
>
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org


Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain

On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake  wrote:

> I have a followup question on this. In our application we have parallel for 
> loops similar to OMP parallel for. I noticed that in order to gain speedup 
> with threads I've to set --bind-to none, otherwise multiple threads will bind 
> to same core giving no increase in performance. For example, I get following 
> (attached) performance for a simple 3point stencil computation run with T 
> threads on 1 MPI process on 1 node (Tx1x1). 
> 
> My understanding is even when there are multiple procs per node we should use 
> --bind-to none in order to get performance with threads. Is this correct? 
> Also, what's the disadvantage of not using --bind-to core?

Your best performance with threads comes when you bind each process to multiple 
cores. Binding helps performance by ensuring your memory is always local, and 
provides some optimized scheduling benefits. You can bind to multiple cores by 
adding the qualifier "pe=N" to your mapping definition, like this:

mpirun --map-by socket:pe=4 

The above example will map processes by socket, and bind each process to 4 
cores.
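
If you want to double-check the result, adding --report-bindings to the same
command prints each rank's actual binding at launch time (a sketch; <app> is a
placeholder for your executable):

mpirun --map-by socket:pe=4 --report-bindings <app>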

HTH
Ralph

> 
> Thank you,
> Saliya
> 
> 
> On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake  wrote:
> Thank you Ralph, I'll check this.
> 
> 
> On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain  wrote:
> It means that OMPI didn't get built against libnuma, and so we can't ensure 
> that memory is being bound local to the proc binding. Check to see if numactl 
> and numactl-devel are installed, or you can turn off the warning using "-mca 
> hwloc_base_mem_bind_failure_action silent"
> 
> 
> On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake  wrote:
> 
>> Hi,
>> 
>> I tried to run an MPI Java program with --bind-to core. I receive the 
>> following warning and wonder how to fix this.
>> 
>> 
>> WARNING: a request was made to bind a process. While the system
>> supports binding the process itself, at least one node does NOT
>> support binding memory to the process location.
>> 
>>   Node:  192.168.0.19
>> 
>> This is a warning only; your job will continue, though performance may
>> be degraded.
>> 
>> 
>> Thank you,
>> Saliya
>> 
>> -- 
>> Saliya Ekanayake esal...@gmail.com 
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> -- 
> Saliya Ekanayake esal...@gmail.com 
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
> 
> 
> 
> -- 
> Saliya Ekanayake esal...@gmail.com 
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
> <3pointstencil.png>___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
I have a follow-up question on this. In our application we have parallel for
loops similar to OMP parallel for. I noticed that in order to gain speedup
with threads I have to set --bind-to none; otherwise multiple threads will
bind to the same core, giving no increase in performance. For example, I get
the following (attached) performance for a simple 3-point stencil computation
run with T threads on 1 MPI process on 1 node (Tx1x1).

My understanding is that even when there are multiple procs per node we should
use --bind-to none in order to get performance with threads. Is this correct?
Also, what's the disadvantage of not using --bind-to core?
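
For reference, the loops in question are similar in spirit to this generic
sketch (not our actual code, which is Java; shown here as C with OpenMP just
to illustrate the pattern):

/* One sweep of a generic 3-point stencil, parallelized like an OMP parallel for. */
void stencil_sweep(const double *in, double *out, int n)
{
    #pragma omp parallel for
    for (int i = 1; i < n - 1; i++) {
        out[i] = (in[i - 1] + in[i] + in[i + 1]) / 3.0;
    }
}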

Thank you,
Saliya


On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake wrote:

> Thank you Ralph, I'll check this.
>
>
> On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain  wrote:
>
>> It means that OMPI didn't get built against libnuma, and so we can't
>> ensure that memory is being bound local to the proc binding. Check to see
>> if numactl and numactl-devel are installed, or you can turn off the warning
>> using "-mca hwloc_base_mem_bind_failure_action silent"
>>
>>
>> On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake  wrote:
>>
>> Hi,
>>
>> I tried to run an MPI Java program with --bind-to core. I receive the
>> following warning and wonder how to fix this.
>>
>>
>> WARNING: a request was made to bind a process. While the system
>> supports binding the process itself, at least one node does NOT
>> support binding memory to the process location.
>>
>>   Node:  192.168.0.19
>>
>> This is a warning only; your job will continue, though performance may
>> be degraded.
>>
>>
>> Thank you,
>> Saliya
>>
>> --
>> Saliya Ekanayake esal...@gmail.com
>> Cell 812-391-4914 Home 812-961-6383
>> http://saliya.org
>>  ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> Saliya Ekanayake esal...@gmail.com
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
>



-- 
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org


Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread Edgar Gabriel
Not sure whether it's the problem or not, but I usually have an additional
flag set:

 --with-io-romio-flags="--with-file-system=pvfs2+ufs+nfs
--with-pvfs2=/opt/pvfs-2.8.2"

compilation is a bit slow for me today...

Edgar


On 2/26/2014 2:05 PM, vithanousek wrote:
> Now I compiled by doing this:
> OrangeFS (original, withou editing):
> 
> ./configure --prefix=/usr/local/orangefs
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> --with-openib=/usr --without-bmi-tcp --enable-shared
> make
> make kmod
> make install
> make kmod_install
> 
> Without error.
> OpenMPI (with edited switch to ifs):
> 
> ./configure --prefix=/usr/local/openmpi_1.6.5_romio
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs'
> make
> make install
> 
> Without error.
> parallel FS mount work. But I still cant use MPIIO.
> I compiled simple MPIIO program and run it by this:
> 
> mpicc -o mpiio mpiio.c
> mpirun -np 1 -host node18 mpiio
> [node18:02334] mca: base: component_find: unable to open
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio:
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so: undefined
> symbol: ADIOI_PVFS2_IReadContig (ignored)
> 
> And no file is created.
> I tried compile it with:
> mpicc -o mpiio mpiio.c -lpvfs2 -L/usr/local/orangefs/lib
> 
> but i got the same results, have You any idea?
> 
> Thank for reply
> Hanousek Vít
> 
> 
> 
> 
> 
> -- Original message --
> From: vithanousek 
> To: Open MPI Users 
> Date: 26. 2. 2014 20:30:17
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> 
> 
> Thanks for your Time,
> 
> I'm little bit confused, what is diferent between pvfs2 and
> orangefs. I was thinking, that only project changes name.
> 
> I get hint from OrangeFS maillist, to compile OrangeFs with
> --enable-shared. This produce a some shared library (.so) in
> /usr/local/orangefs/lib and I can compile OpenMPI 1.6.5 now (with
> fixed "switch =>ifs" in ROMIO).
> 
> I will test if it is working in next hour (some configuration steps
> is needed).
> 
> Thanks.
> Hanousek Vít
> 
> -- Original message --
> From: Edgar Gabriel 
> To: Open MPI Users 
> Date: 26. 2. 2014 20:18:03
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> 
> 
> so we had ROMIO working with PVFS2 (not OrangeFS, which is however
> registered as PVFS2 internally). We have one cluster which uses
> OrangeFS, on that machine however we used OMPIO, not ROMIO. I am
> currently compiling the 1.6 version of Open MPI to see whether I can
> reproduce your problem.
> 
> Thanks
> Edgar
> 
> On 2/26/2014 12:23 PM, vithanousek wrote:
> > Thanks for reply,
> >
> > Is it possible that the patch solvs all this problems, not
> only "switch
> > => ifs" problem?
> > I realy dont know, wher the problem is now (OpenMPI, ROMIO,
> OrangeFS).
> >
> > Thanks
> > Hanousek Vít
> >
> > -- Original message --
> > From: Ralph Castain 
> > To: Open MPI Users 
> > Date: 26. 2. 2014 19:16:36
> > Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> >
> >
> > Edgar hasn't had a chance to find the necessary patch - he was on
> > travel, returning soon.
> >
> >
> > On Feb 26, 2014, at 9:27 AM, vithanousek
>  wrote:
> >
> > > Hello,
> > >
> > > I have still problems with compiling OpenMPI 1.6.5 with OrangeFS
> > 2.8.7 support.
> > >
> > > I compiled OrangeFS by this:
> > >
> > > ./configure --prefix=/usr/local/orangefs2
> > --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> > --with-openib=/usr --without-bmi-tcp
> > > make -j 32
> > > make -j 32 kmod
> > > make install
> > > make kmod_install
> > >
> > > this works.
> > > than I tried to compile OpenMPI (with fixed convert_named
> function
> > in ad_pvfs2_io_dtype.c) by this:
> > >
> > > ./configure --prefix=/usr/local/openmpi_1.6.5_romio2
> > --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> > > (...)
> > > make -j32
> > > (...)
> > > CCLD mca_io_romio.la
> > > /usr/bin/ld:
> /usr/local/orangefs2/lib/libpvfs2.a(errno-mapping.o):
> > relocation R_X86_64_32S against `PINT_errno_mapping' can not
> be used
> > when making a shared object; recompile with -fPIC
> > > /usr/local/orangefs2/lib/libpvfs2.a: could not read symbols:
> Bad value
> > > collect2: ld returned 1 exit status
> > > make[3]: *** 

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread vithanousek
Now I compiled by doing this:
OrangeFS (original, without editing):

./configure --prefix=/usr/local/orangefs --with-kernel=/usr/src/kernels/2.6.
32-431.5.1.el6.x86_64 --with-openib=/usr --without-bmi-tcp --enable-shared
make 
make kmod
make install
make kmod_install

Without error.
OpenMPI (with edited switch to ifs):

./configure --prefix=/usr/local/openmpi_1.6.5_romio --with-io-romio-flags='-
-with-pvfs2=/usr/local/orangefs'
make 
make install

Without error.
The parallel FS mount works, but I still can't use MPI-IO.
I compiled a simple MPI-IO program and ran it like this:

mpicc -o mpiio mpiio.c
mpirun -np 1 -host node18 mpiio
[node18:02334] mca: base: component_find: unable to open /usr/local/openmpi_
1.6.5_romio/lib/openmpi/mca_io_romio: /usr/local/openmpi_1.6.5_romio/lib/
openmpi/mca_io_romio.so: undefined symbol: ADIOI_PVFS2_IReadContig (ignored)

And no file is created.
I tried compiling it with:
mpicc -o mpiio mpiio.c -lpvfs2 -L/usr/local/orangefs/lib

but I got the same results. Do you have any idea?
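
(For completeness, the test program is essentially a minimal MPI-IO write
along the following lines; this is a sketch rather than the exact source, and
the file name testfile.out is arbitrary:)

#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_File fh;
    int rank;
    char buf[] = "hello from MPI-IO\n";

    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    /* Each rank writes its buffer at a disjoint offset of the shared file. */
    MPI_File_open(MPI_COMM_WORLD, "testfile.out",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at(fh, (MPI_Offset)(rank * sizeof(buf)),
                      buf, (int)sizeof(buf), MPI_CHAR, MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    MPI_Finalize();
    return 0;
}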

Thanks for the reply
Hanousek Vít






-- Original message --
From: vithanousek 
To: Open MPI Users 
Date: 26. 2. 2014 20:30:17
Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

"
Thanks for your Time,

I'm little bit confused, what is diferent between pvfs2 and orangefs. I was 
thinking, that only project changes name.

I get hint from OrangeFS maillist, to compile OrangeFs with --enable-shared.
This produce a some shared library (.so) in /usr/local/orangefs/lib and I 
can compile OpenMPI 1.6.5 now (with fixed "switch =>ifs" in ROMIO).

I will test if it is working in next hour (some configuration steps is 
needed).

Thanks.
Hanousek Vít


-- Original message --
From: Edgar Gabriel 
To: Open MPI Users 
Date: 26. 2. 2014 20:18:03
Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

"so we had ROMIO working with PVFS2 (not OrangeFS, which is however
registered as PVFS2 internally). We have one cluster which uses
OrangeFS, on that machine however we used OMPIO, not ROMIO. I am
currently compiling the 1.6 version of Open MPI to see whether I can
reproduce your problem.

Thanks
Edgar

On 2/26/2014 12:23 PM, vithanousek wrote:
> Thanks for reply,
> 
> Is it possible that the patch solvs all this problems, not only "switch
> => ifs" problem?
> I realy dont know, wher the problem is now (OpenMPI, ROMIO, OrangeFS).
> 
> Thanks
> Hanousek Vít
> 
> -- Original message --
> From: Ralph Castain 
> To: Open MPI Users 
> Date: 26. 2. 2014 19:16:36
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> 
> 
> Edgar hasn't had a chance to find the necessary patch - he was on
> travel, returning soon.
> 
> 
> On Feb 26, 2014, at 9:27 AM, vithanousek  wrote:
> 
> > Hello,
> >
> > I have still problems with compiling OpenMPI 1.6.5 with OrangeFS
> 2.8.7 support.
> >
> > I compiled OrangeFS by this:
> >
> > ./configure --prefix=/usr/local/orangefs2
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> --with-openib=/usr --without-bmi-tcp
> > make -j 32
> > make -j 32 kmod
> > make install
> > make kmod_install
> >
> > this works.
> > than I tried to compile OpenMPI (with fixed convert_named function
> in ad_pvfs2_io_dtype.c) by this:
> >
> > ./configure --prefix=/usr/local/openmpi_1.6.5_romio2
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> > (...)
> > make -j32
> > (...)
> > CCLD mca_io_romio.la
> > /usr/bin/ld: /usr/local/orangefs2/lib/libpvfs2.a(errno-mapping.o):
> relocation R_X86_64_32S against `PINT_errno_mapping' can not be used
> when making a shared object; recompile with -fPIC
> > /usr/local/orangefs2/lib/libpvfs2.a: could not read symbols: Bad value
> > collect2: ld returned 1 exit status
> > make[3]: *** [mca_io_romio.la] Error 1
> >
> > So I tried recompile OrangeFS by this:
> >
> > export CFLAGS="-fPIC"
> > ./configure --prefix=/usr/local/orangefs2
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> --with-openib=/usr --without-bmi-tcp
> > make -j 32
> > make -j 32 kmod
> > make install
> > make kmod_install
> >
> > (there was errors with current->fsuid => current->cred->fsuid, in
> multiple files. I hardcoded this in files, bad idea I know )
> > Then compilation of OpenMPI works.
> >
> > ./configure --prefix=/usr/local/openmpi_1.6.5_romio2
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> > make -j32
> > make install
> >
> > but when i created simple program which is using MPIIO, it failed
> when i run it:
> >
> > mpirun -np 1 -host node18 mpiio
> > [node18:01696] mca: base: component_find: unable to open
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio:
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so:
> undefined symbol: ADIOI_PVFS2_IReadContig (ignored)
> >
> > Because I got message form OrangeFS mailing list about -fPIC
> errors, i tryed to recompile OrangeFS withou this flag 


Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread Edgar Gabriel
So we had ROMIO working with PVFS2 (not OrangeFS, which is, however,
registered as PVFS2 internally). We have one cluster that uses OrangeFS;
on that machine, however, we used OMPIO, not ROMIO. I am currently compiling
the 1.6 version of Open MPI to see whether I can reproduce your problem.

Thanks
Edgar
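
For context, ROMIO and OMPIO are two implementations of the MPI-IO layer,
selected through Open MPI's io framework. On builds that actually ship the
ompio component (it is not present in every release), the choice can be forced
at run time roughly like this; mpiio_test is only a placeholder for any MPI-IO
program:

    mpirun -np 4 -mca io ompio ./mpiio_test    # use OMPIO
    mpirun -np 4 -mca io romio ./mpiio_test    # force ROMIO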

On 2/26/2014 12:23 PM, vithanousek wrote:
> Thanks for reply,
> 
> Is it possible that the patch solvs all this problems, not only "switch
> => ifs" problem?
> I realy dont know, wher the problem is now (OpenMPI, ROMIO, OrangeFS).
> 
> Thanks
> Hanousek Vít
> 
> -- Original message --
> From: Ralph Castain 
> To: Open MPI Users 
> Date: 26. 2. 2014 19:16:36
> Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS
> 
> 
> Edgar hasn't had a chance to find the necessary patch - he was on
> travel, returning soon.
> 
> 
> On Feb 26, 2014, at 9:27 AM, vithanousek  wrote:
> 
> > Hello,
> >
> > I have still problems with compiling OpenMPI 1.6.5 with OrangeFS
> 2.8.7 support.
> >
> > I compiled OrangeFS by this:
> >
> > ./configure --prefix=/usr/local/orangefs2
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> --with-openib=/usr --without-bmi-tcp
> > make -j 32
> > make -j 32 kmod
> > make install
> > make kmod_install
> >
> > this works.
> > than I tried to compile OpenMPI (with fixed convert_named function
> in ad_pvfs2_io_dtype.c) by this:
> >
> > ./configure --prefix=/usr/local/openmpi_1.6.5_romio2
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> > (...)
> > make -j32
> > (...)
> > CCLD mca_io_romio.la
> > /usr/bin/ld: /usr/local/orangefs2/lib/libpvfs2.a(errno-mapping.o):
> relocation R_X86_64_32S against `PINT_errno_mapping' can not be used
> when making a shared object; recompile with -fPIC
> > /usr/local/orangefs2/lib/libpvfs2.a: could not read symbols: Bad value
> > collect2: ld returned 1 exit status
> > make[3]: *** [mca_io_romio.la] Error 1
> >
> > So I tried recompile OrangeFS by this:
> >
> > export CFLAGS="-fPIC"
> > ./configure --prefix=/usr/local/orangefs2
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64
> --with-openib=/usr --without-bmi-tcp
> > make -j 32
> > make -j 32 kmod
> > make install
> > make kmod_install
> >
> > (there was errors with current->fsuid => current->cred->fsuid, in
> multiple files. I hardcoded this in files, bad idea I know )
> > Then compilation of OpenMPI works.
> >
> > ./configure --prefix=/usr/local/openmpi_1.6.5_romio2
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> > make -j32
> > make install
> >
> > but when i created simple program which is using MPIIO, it failed
> when i run it:
> >
> > mpirun -np 1 -host node18 mpiio
> > [node18:01696] mca: base: component_find: unable to open
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio:
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so:
> undefined symbol: ADIOI_PVFS2_IReadContig (ignored)
> >
> > Because I got message form OrangeFS mailing list about -fPIC
> errors, i tryed to recompile OrangeFS withou this flag and compile
> OpenMPI (static linked) by this:
> >
> > ./congure --prefix=/usr/local/openmpi_1.6.5_romio2
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> --enable-static --disable-shared
> > (...)
> > make -j 32
> > (...)
> > CCLD otfmerge-mpi
> >
> 
> /root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x60):
> undefined reference to `ADIOI_PVFS2_IReadContig'
> >
> 
> /root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x68):
> undefined reference to `ADIOI_PVFS2_IWriteContig'
> > collect2: ld returned 1 exit status
> > make[10]: *** [otfmerge-mpi] Error 1
> > (...)
> >
> > Now I realy dont know, what is wrong.
> > Is there Anybody ho has OpenMPI working with OrangeFS?
> >
> > Thanks for replies
> > HanousekVít
> > ___
> > users mailing list
> > us...@open-mpi.org
> > http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 

-- 
Edgar Gabriel
Associate Professor
Parallel Software Technologies Lab  http://pstl.cs.uh.edu
Department of Computer Science  University of Houston
Philip G. Hoffman Hall, Room 524Houston, TX-77204, USA
Tel: +1 (713) 743-3857  

Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread vithanousek
Thanks for the reply,

Is it possible that the patch solves all of these problems, not only the "switch
=> ifs" problem?
I really don't know where the problem is now (OpenMPI, ROMIO, or OrangeFS).

Thanks
Hanousek Vít


-- Original message --
From: Ralph Castain 
To: Open MPI Users 
Date: 26. 2. 2014 19:16:36
Subject: Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

"Edgar hasn't had a chance to find the necessary patch - he was on travel, 
returning soon.


On Feb 26, 2014, at 9:27 AM, vithanousek  wrote:

> Hello,
> 
> I have still problems with compiling OpenMPI 1.6.5 with OrangeFS 2.8.7 support.
> 
> I compiled OrangeFS by this:
> 
> ./configure --prefix=/usr/local/orangefs2 --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64 --with-openib=/usr --without-bmi-tcp
> make -j 32
> make -j 32 kmod
> make install
> make kmod_install
> 
> this works.
> than I tried to compile OpenMPI (with fixed convert_named function in ad_pvfs2_io_dtype.c) by this:
> 
> ./configure --prefix=/usr/local/openmpi_1.6.5_romio2 --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> (...)
> make -j32
> (...)
> CCLD mca_io_romio.la
> /usr/bin/ld: /usr/local/orangefs2/lib/libpvfs2.a(errno-mapping.o): relocation R_X86_64_32S against `PINT_errno_mapping' can not be used when making a shared object; recompile with -fPIC
> /usr/local/orangefs2/lib/libpvfs2.a: could not read symbols: Bad value
> collect2: ld returned 1 exit status
> make[3]: *** [mca_io_romio.la] Error 1
> 
> So I tried recompile OrangeFS by this:
> 
> export CFLAGS="-fPIC"
> ./configure --prefix=/usr/local/orangefs2 --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64 --with-openib=/usr --without-bmi-tcp
> make -j 32
> make -j 32 kmod
> make install
> make kmod_install
> 
> (there was errors with current->fsuid => current->cred->fsuid, in multiple files. I hardcoded this in files, bad idea I know )
> Then compilation of OpenMPI works.
> 
> ./configure --prefix=/usr/local/openmpi_1.6.5_romio2 --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> make -j32
> make install
> 
> but when i created simple program which is using MPIIO, it failed when i run it:
> 
> mpirun -np 1 -host node18 mpiio
> [node18:01696] mca: base: component_find: unable to open /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio: /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so: undefined symbol: ADIOI_PVFS2_IReadContig (ignored)
> 
> Because I got message form OrangeFS mailing list about -fPIC errors, i tryed to recompile OrangeFS withou this flag and compile OpenMPI (static linked) by this:
> 
> ./congure --prefix=/usr/local/openmpi_1.6.5_romio2 --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2' --enable-static --disable-shared
> (...)
> make -j 32
> (...)
> CCLD otfmerge-mpi
> /root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x60): undefined reference to `ADIOI_PVFS2_IReadContig'
> /root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x68): undefined reference to `ADIOI_PVFS2_IWriteContig'
> collect2: ld returned 1 exit status
> make[10]: *** [otfmerge-mpi] Error 1
> (...)
> 
> Now I realy dont know, what is wrong.
> Is there Anybody ho has OpenMPI working with OrangeFS?
> 
> Thanks for replies
> HanousekVít
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

___
users mailing list
us...@open-mpi.org
http://www.open-mpi.org/mailman/listinfo.cgi/users"

Re: [OMPI users] Compiling Open MPI 1.7.4 using PGI 14.2 and Mellanox HCOLL enabled

2014-02-26 Thread Ralph Castain
Perhaps you could try the nightly 1.7.5 tarball? I believe some PGI fixes may 
have gone in there.
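
As a stopgap, the "fake wrapper" mentioned further down in the quoted message
is usually just a small script that strips the -pthread flag, which pgcc
rejects, before calling the real compiler. A minimal sketch; the PGI install
path is hypothetical and must be adjusted:

    #!/bin/bash
    # pgcc-wrapper: drop -pthread and pass everything else to the real pgcc
    real_pgcc=/opt/pgi/linux86-64/14.2/bin/pgcc    # hypothetical location
    args=()
    for a in "$@"; do
        [ "$a" = "-pthread" ] || args+=("$a")
    done
    exec "$real_pgcc" "${args[@]}"

Configuring Open MPI with CC pointing at such a wrapper, or using the
-noswitcherror flag mentioned below, are the two workarounds that come up in
the linked threads.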


On Feb 25, 2014, at 3:22 PM, Filippo Spiga  wrote:

> Dear all,
> 
> I came across another small issue while I was compiling Open MPI 1.7.4 using 
> PGI 14.2 and building the support for Mellanox Hierarchical Collectives 
> (--with-hcoll). Here you how configure Open MPI:
> 
> export MXM_DIR=/opt/mellanox/mxm
> export KNEM_DIR=$(find /opt -maxdepth 1 -type d -name "knem*" -print0)
> export FCA_DIR=/opt/mellanox/fca
> export HCOLL_DIR=/opt/mellanox/hcoll
> 
> ../configure  CC=pgcc CXX=pgCC FC=pgf90 F90=pgf90 
> --prefix=/usr/local/Cluster-Users/fs395/openmpi-1.7.4/pgi-14.2_cuda-6.0RC  
> --enable-mpirun-prefix-by-default --with-hcoll=$HCOLL_DIR --with-fca=$FCA_DIR 
> --with-mxm=$MXM_DIR --with-knem=$KNEM_DIR 
> --with-slurm=/usr/local/Cluster-Apps/slurm  --with-cuda=$CUDA_INSTALL_PATH
> 
> 
> At some point the compile process fails with this error:
> 
> make[2]: Leaving directory 
> `/home/fs395/archive/openmpi-1.7.4/build/ompi/mca/coll/hierarch'
> Making all in mca/coll/hcoll
> make[2]: Entering directory 
> `/home/fs395/archive/openmpi-1.7.4/build/ompi/mca/coll/hcoll'
>  CC   coll_hcoll_module.lo
>  CC   coll_hcoll_component.lo
>  CC   coll_hcoll_rte.lo
>  CC   coll_hcoll_ops.lo
>  CCLD mca_coll_hcoll.la
> pgcc-Error-Unknown switch: -pthread
> make[2]: *** [mca_coll_hcoll.la] Error 1
> make[2]: Leaving directory 
> `/home/fs395/archive/openmpi-1.7.4/build/ompi/mca/coll/hcoll'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/home/fs395/archive/openmpi-1.7.4/build/ompi'
> make: *** [all-recursive] Error 1
> 
> Attached the configure.log and the make.log collected as reported on the 
> website.  Using google I found an old post referring to the same problem. 
> Here few relevant links:
> http://www.open-mpi.org/community/lists/users/2009/03/8687.php
> http://www.open-mpi.org/community/lists/users/2010/09/14229.php
> http://www.open-mpi.org/community/lists/users/2009/04/8911.php
> 
> I have no problem to use a fake wrapper or the "-noswitcherror" compiler 
> pgf90 flag. I wonder if this procedure will affect in some way the MPI built 
> and I have to carry on this flag also when I compile my applications. 
> 
> Is there any way to fix libtool so Open MPI can build itself properly?
> 
> Thanks
> Filippo
> 
> --
> Mr. Filippo SPIGA, M.Sc.
> http://www.linkedin.com/in/filippospiga ~ skype: filippo.spiga
> 
> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
> 
> *
> Disclaimer: "Please note this message and any attachments are CONFIDENTIAL 
> and may be privileged or otherwise protected from disclosure. The contents 
> are not to be disclosed to anyone other than the addressee. Unauthorized 
> recipients are requested to preserve this confidentiality and to advise the 
> sender immediately of any error in transmission."
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread Ralph Castain
Edgar hasn't had a chance to find the necessary patch - he was on travel, 
returning soon.


On Feb 26, 2014, at 9:27 AM, vithanousek  wrote:

> Hello,
> 
> I have still problems with compiling OpenMPI 1.6.5 with OrangeFS 2.8.7 
> support.
> 
> I compiled OrangeFS by this:
> 
>  ./configure --prefix=/usr/local/orangefs2 
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64 --with-openib=/usr 
> --without-bmi-tcp
> make -j 32
> make -j 32 kmod
> make install
> make kmod_install
> 
> this works.
> than I tried to compile OpenMPI (with fixed convert_named function in 
> ad_pvfs2_io_dtype.c)  by this:
> 
> ./configure --prefix=/usr/local/openmpi_1.6.5_romio2 
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> (...)
> make -j32
> (...)
> CCLD mca_io_romio.la
> /usr/bin/ld: /usr/local/orangefs2/lib/libpvfs2.a(errno-mapping.o): relocation 
> R_X86_64_32S against `PINT_errno_mapping' can not be used when making a 
> shared object; recompile with -fPIC
> /usr/local/orangefs2/lib/libpvfs2.a: could not read symbols: Bad value
> collect2: ld returned 1 exit status
> make[3]: *** [mca_io_romio.la] Error 1
> 
> So I tried recompile OrangeFS by this:
> 
> export CFLAGS="-fPIC"
> ./configure --prefix=/usr/local/orangefs2 
> --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64 --with-openib=/usr 
> --without-bmi-tcp
> make -j 32
> make -j 32 kmod
> make install
> make kmod_install
> 
> (there was errors with current->fsuid => current->cred->fsuid, in multiple 
> files. I hardcoded this in files, bad idea I know )
> Then compilation of OpenMPI works.
> 
> ./configure --prefix=/usr/local/openmpi_1.6.5_romio2 
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
> make -j32
> make install
> 
> but when i created simple program which is using MPIIO, it failed when i run 
> it:
> 
> mpirun -np 1 -host node18 mpiio 
> [node18:01696] mca: base: component_find: unable to open 
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio: 
> /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so: undefined symbol: 
> ADIOI_PVFS2_IReadContig (ignored)
> 
> Because I got message form OrangeFS mailing list about -fPIC errors, i tryed 
> to recompile OrangeFS withou this flag and compile OpenMPI (static linked) by 
> this:  
> 
> ./congure --prefix=/usr/local/openmpi_1.6.5_romio2 
> --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2' --enable-static 
> --disable-shared
> (...)
> make -j 32
> (...)
>   CCLD   otfmerge-mpi
> /root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x60):
>  undefined reference to `ADIOI_PVFS2_IReadContig'
> /root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x68):
>  undefined reference to `ADIOI_PVFS2_IWriteContig'
> collect2: ld returned 1 exit status
> make[10]: *** [otfmerge-mpi] Error 1
> (...)
> 
> Now I realy dont know, what is wrong. 
> Is there Anybody ho has OpenMPI working with OrangeFS?
> 
> Thanks for replies
> HanousekVít
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] OpenMPI-ROMIO-OrangeFS

2014-02-26 Thread vithanousek
Hello,

I still have problems compiling OpenMPI 1.6.5 with OrangeFS 2.8.7 support.

I compiled OrangeFS like this:

./configure --prefix=/usr/local/orangefs2 --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64 --with-openib=/usr --without-bmi-tcp
make -j 32
make -j 32 kmod
make install
make kmod_install

This works.
Then I tried to compile OpenMPI (with the fixed convert_named function in ad_pvfs2_io_dtype.c) like this:

./configure --prefix=/usr/local/openmpi_1.6.5_romio2 --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
(...)
make -j32
(...)
CCLD mca_io_romio.la
/usr/bin/ld: /usr/local/orangefs2/lib/libpvfs2.a(errno-mapping.o): relocation R_X86_64_32S against `PINT_errno_mapping' can not be used when making a shared object; recompile with -fPIC
/usr/local/orangefs2/lib/libpvfs2.a: could not read symbols: Bad value
collect2: ld returned 1 exit status
make[3]: *** [mca_io_romio.la] Error 1

So I tried to recompile OrangeFS like this:

export CFLAGS="-fPIC"
./configure --prefix=/usr/local/orangefs2 --with-kernel=/usr/src/kernels/2.6.32-431.5.1.el6.x86_64 --with-openib=/usr --without-bmi-tcp
make -j 32
make -j 32 kmod
make install
make kmod_install

(There were errors with current->fsuid => current->cred->fsuid in multiple
files. I hardcoded the fix in those files, which I know is a bad idea.)
Then the compilation of OpenMPI works:

./configure --prefix=/usr/local/openmpi_1.6.5_romio2 --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2'
make -j32
make install

But when I created a simple program which uses MPI-IO, it failed when I ran it:

mpirun -np 1 -host node18 mpiio
[node18:01696] mca: base: component_find: unable to open /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio: /usr/local/openmpi_1.6.5_romio/lib/openmpi/mca_io_romio.so: undefined symbol: ADIOI_PVFS2_IReadContig (ignored)

Because I got a message from the OrangeFS mailing list about the -fPIC errors,
I tried to recompile OrangeFS without this flag and to compile OpenMPI
(statically linked) like this:

./configure --prefix=/usr/local/openmpi_1.6.5_romio2 --with-io-romio-flags='--with-pvfs2=/usr/local/orangefs2' --enable-static --disable-shared
(...)
make -j 32
(...)
  CCLD   otfmerge-mpi
/root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x60): undefined reference to `ADIOI_PVFS2_IReadContig'
/root/openmpi-1.6.5/ompi/contrib/vt/vt/../../../.libs/libmpi.a(ad_pvfs2.o):(.data+0x68): undefined reference to `ADIOI_PVFS2_IWriteContig'
collect2: ld returned 1 exit status
make[10]: *** [otfmerge-mpi] Error 1
(...)

Now I really don't know what is wrong.
Is there anybody who has OpenMPI working with OrangeFS?

Thanks for replies
Hanousek Vít
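
A quick way to narrow down this kind of failure is to check whether the PVFS2
ADIO symbols actually ended up in the ROMIO component and whether libpvfs2 is
visible at run time. A rough sketch, using the prefixes from the message above
(adjust them to whatever was actually installed):

    # does the ROMIO plugin define the PVFS2 ADIO entry points, or only reference them?
    nm -D /usr/local/openmpi_1.6.5_romio2/lib/openmpi/mca_io_romio.so | grep ADIOI_PVFS2

    # was libpvfs2 built as a shared library, and can the plugin resolve it?
    ls /usr/local/orangefs2/lib/libpvfs2*
    ldd /usr/local/openmpi_1.6.5_romio2/lib/openmpi/mca_io_romio.so | grep -i pvfs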


Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Thank you Ralph, I'll check this.


On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain  wrote:

> It means that OMPI didn't get built against libnuma, and so we can't
> ensure that memory is being bound local to the proc binding. Check to see
> if numactl and numactl-devel are installed, or you can turn off the warning
> using "-mca hwloc_base_mem_bind_failure_action silent"
>
>
> On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake  wrote:
>
> Hi,
>
> I tried to run an MPI Java program with --bind-to core. I receive the
> following warning and wonder how to fix this.
>
>
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>
>   Node:  192.168.0.19
>
> This is a warning only; your job will continue, though performance may
> be degraded.
>
>
> Thank you,
> Saliya
>
> --
> Saliya Ekanayake esal...@gmail.com
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
>  ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org


Re: [OMPI users] slowdown with infiniband and latest CentOS kernel

2014-02-26 Thread Bernd Dammann

Hi,

I found this thread from before Christmas and wondered what the status of this 
problem is. We have experienced the same problems since our upgrade to 
Scientific Linux 6.4, kernel 2.6.32-431.1.2.el6.x86_64, and OpenMPI 1.6.5.

Users have reported severe slowdowns in all kinds of applications, such as 
VASP, OpenFOAM, etc.

Using the '--bind-to-core' workaround only makes sense for jobs that allocate 
full nodes, but the majority of our jobs don't do that.


Is there any news on this issue?

Regards,
Bernd

--
DTU Computing Center
Technical University of Denmark


Re: [OMPI users] OpenMPI 1.7.5 and "--map-by" new syntax

2014-02-26 Thread Ralph Castain
My bad - I'll fix the help message. Thanks!

On Feb 26, 2014, at 6:42 AM, Filippo Spiga  wrote:

> Yes it works. Information provided by mpirun is confusing but I get the right 
> syntax now. Thank you!
> 
> F
> 
> 
> 
> On Feb 26, 2014, at 12:34 PM, tmish...@jcity.maeda.co.jp wrote:
>> Hi, this help message might be just a simple mistake.
>> 
>> Please try: mpirun -np 20 --map-by ppr:5:socket -bind-to core osu_alltoall
>> 
>> There's no available explanation yet as far as I know, because it's still
>> alfa version.
>> 
>> Tetsuya Mishima
>> 
>>> Dear all,
>>> 
>>> I am playing with Open MPI 1.7.5 and with the "--map-by" option but I am
>> not sure I am doing thing correctly despite I am following the instruction.
>> Here what I got
>>> 
>>> $mpirun -np 20 --npersocket 5 -bind-to core osu_alltoall
>>> 
>> --
>>> The following command line options and corresponding MCA parameter have
>>> been deprecated and replaced as follows:
>>> 
>>> Command line options:
>>> Deprecated:  --npersocket, -npersocket
>>> Replacement: --map-by socket:PPR=N
>>> 
>>> Equivalent MCA parameter:
>>> Deprecated:  rmaps_base_n_persocket, rmaps_ppr_n_persocket
>>> Replacement: rmaps_base_mapping_policy=socket:PPR=N
>>> 
>>> The deprecated forms *will* disappear in a future version of Open MPI.
>>> Please update to the new syntax.
>>> 
>> --
>>> 
>>> 
>>> after changing according to the instructions I see
>>> 
>>> $ mpirun -np 24 --map-by socket:PPR=5 -bind-to core osu_alltoall
>>> 
>>> 
>> --
>>> The mapping request contains an unrecognized modifier:
>>> 
>>> Request: socket:PPR=5
>>> 
>>> Please check your request and try again.
>>> 
>> --
>>> [tesla49:30459] [[29390,0],0] ORTE_ERROR_LOG: Bad parameter in file
>> ess_hnp_module.c at line 510
>>> 
>> --
>>> It looks like orte_init failed for some reason; your parallel process is
>>> likely to abort.  There are many reasons that a parallel process can
>>> fail during orte_init; some of which are due to configuration or
>>> environment problems.  This failure appears to be an internal failure;
>>> here's some additional information (which may only be relevant to an
>>> Open MPI developer):
>>> 
>>> orte_rmaps_base_open failed
>>> --> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
>>> 
>> --
>>> 
>>> 
>>> 
>>> Is there any place where the new syntax is explained?
>>> 
>>> Thanks in advance
>>> F
>>> 
>>> --
>>> Mr. Filippo SPIGA, M.Sc. - HPC  Application Specialist
>>> High Performance Computing Service, University of Cambridge (UK)
>>> http://www.hpc.cam.ac.uk/ ~ http://filippospiga.me ~ skype: filippo.spiga
>>> 
>>> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>>> 
>>> *
>>> Disclaimer: "Please note this message and any attachments are
>> CONFIDENTIAL and may be privileged or otherwise protected from disclosure.
>> The contents are not to be disclosed to anyone other than the
>>> addressee. Unauthorized recipients are requested to preserve this
>> confidentiality and to advise the sender immediately of any error in
>> transmission."
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>> 
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> --
> Mr. Filippo SPIGA, M.Sc.
> http://www.linkedin.com/in/filippospiga ~ skype: filippo.spiga
> 
> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
> 
> *
> Disclaimer: "Please note this message and any attachments are CONFIDENTIAL 
> and may be privileged or otherwise protected from disclosure. The contents 
> are not to be disclosed to anyone other than the addressee. Unauthorized 
> recipients are requested to preserve this confidentiality and to advise the 
> sender immediately of any error in transmission."
> 
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain
It means that OMPI didn't get built against libnuma, and so we can't ensure 
that memory is being bound local to the proc binding. Check to see if numactl 
and numactl-devel are installed, or you can turn off the warning using "-mca 
hwloc_base_mem_bind_failure_action silent"
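
On an RPM-based system that check, and the run-time alternative, look roughly
like this. The package names can differ by distribution, Open MPI has to be
rebuilt after installing the headers for the support to appear, and
MyMPIProgram is only a placeholder for the actual Java class:

    # is NUMA memory-binding support installed?
    rpm -q numactl numactl-devel

    # or simply silence the warning when launching:
    mpirun -np 4 --bind-to core -mca hwloc_base_mem_bind_failure_action silent java MyMPIProgram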


On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake  wrote:

> Hi,
> 
> I tried to run an MPI Java program with --bind-to core. I receive the 
> following warning and wonder how to fix this.
> 
> 
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
> 
>   Node:  192.168.0.19
> 
> This is a warning only; your job will continue, though performance may
> be degraded.
> 
> 
> Thank you,
> Saliya
> 
> -- 
> Saliya Ekanayake esal...@gmail.com 
> Cell 812-391-4914 Home 812-961-6383
> http://saliya.org
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



Re: [OMPI users] OpenMPI 1.7.5 and "--map-by" new syntax

2014-02-26 Thread Filippo Spiga
Yes, it works. The information provided by mpirun is confusing, but I have the 
right syntax now. Thank you!

F

 

On Feb 26, 2014, at 12:34 PM, tmish...@jcity.maeda.co.jp wrote:
> Hi, this help message might be just a simple mistake.
> 
> Please try: mpirun -np 20 --map-by ppr:5:socket -bind-to core osu_alltoall
> 
> There's no available explanation yet as far as I know, because it's still
> alfa version.
> 
> Tetsuya Mishima
> 
>> Dear all,
>> 
>> I am playing with Open MPI 1.7.5 and with the "--map-by" option but I am
> not sure I am doing thing correctly despite I am following the instruction.
> Here what I got
>> 
>> $mpirun -np 20 --npersocket 5 -bind-to core osu_alltoall
>> 
> --
>> The following command line options and corresponding MCA parameter have
>> been deprecated and replaced as follows:
>> 
>> Command line options:
>> Deprecated:  --npersocket, -npersocket
>> Replacement: --map-by socket:PPR=N
>> 
>> Equivalent MCA parameter:
>> Deprecated:  rmaps_base_n_persocket, rmaps_ppr_n_persocket
>> Replacement: rmaps_base_mapping_policy=socket:PPR=N
>> 
>> The deprecated forms *will* disappear in a future version of Open MPI.
>> Please update to the new syntax.
>> 
> --
>> 
>> 
>> after changing according to the instructions I see
>> 
>> $ mpirun -np 24 --map-by socket:PPR=5 -bind-to core osu_alltoall
>> 
>> 
> --
>> The mapping request contains an unrecognized modifier:
>> 
>> Request: socket:PPR=5
>> 
>> Please check your request and try again.
>> 
> --
>> [tesla49:30459] [[29390,0],0] ORTE_ERROR_LOG: Bad parameter in file
> ess_hnp_module.c at line 510
>> 
> --
>> It looks like orte_init failed for some reason; your parallel process is
>> likely to abort.  There are many reasons that a parallel process can
>> fail during orte_init; some of which are due to configuration or
>> environment problems.  This failure appears to be an internal failure;
>> here's some additional information (which may only be relevant to an
>> Open MPI developer):
>> 
>> orte_rmaps_base_open failed
>> --> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
>> 
> --
>> 
>> 
>> 
>> Is there any place where the new syntax is explained?
>> 
>> Thanks in advance
>> F
>> 
>> --
>> Mr. Filippo SPIGA, M.Sc. - HPC  Application Specialist
>> High Performance Computing Service, University of Cambridge (UK)
>> http://www.hpc.cam.ac.uk/ ~ http://filippospiga.me ~ skype: filippo.spiga
>> 
>> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>> 
>> *
>> Disclaimer: "Please note this message and any attachments are
> CONFIDENTIAL and may be privileged or otherwise protected from disclosure.
> The contents are not to be disclosed to anyone other than the
>> addressee. Unauthorized recipients are requested to preserve this
> confidentiality and to advise the sender immediately of any error in
> transmission."
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
> 
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users

--
Mr. Filippo SPIGA, M.Sc.
http://www.linkedin.com/in/filippospiga ~ skype: filippo.spiga

«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert

*
Disclaimer: "Please note this message and any attachments are CONFIDENTIAL and 
may be privileged or otherwise protected from disclosure. The contents are not 
to be disclosed to anyone other than the addressee. Unauthorized recipients are 
requested to preserve this confidentiality and to advise the sender immediately 
of any error in transmission."




Re: [OMPI users] OpenMPI 1.7.5 and "--map-by" new syntax

2014-02-26 Thread tmishima


Hi, this help message might just be a simple mistake.

Please try: mpirun -np 20 --map-by ppr:5:socket -bind-to core osu_alltoall

There's no explanation available yet as far as I know, because it's still an
alpha version.

Tetsuya Mishima
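
To confirm that a mapping/binding combination does what you expect, it can help
to add --report-bindings, which makes each node print where every rank was
bound at launch; for example (with osu_alltoall as in the command above):

    mpirun -np 20 --map-by ppr:5:socket -bind-to core --report-bindings osu_alltoall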

> Dear all,
>
> I am playing with Open MPI 1.7.5 and with the "--map-by" option but I am
not sure I am doing thing correctly despite I am following the instruction.
Here what I got
>
> $mpirun -np 20 --npersocket 5 -bind-to core osu_alltoall
>
--
> The following command line options and corresponding MCA parameter have
> been deprecated and replaced as follows:
>
> Command line options:
> Deprecated:  --npersocket, -npersocket
> Replacement: --map-by socket:PPR=N
>
> Equivalent MCA parameter:
> Deprecated:  rmaps_base_n_persocket, rmaps_ppr_n_persocket
> Replacement: rmaps_base_mapping_policy=socket:PPR=N
>
> The deprecated forms *will* disappear in a future version of Open MPI.
> Please update to the new syntax.
>
--
>
>
> after changing according to the instructions I see
>
> $ mpirun -np 24 --map-by socket:PPR=5 -bind-to core osu_alltoall
>
>
--
> The mapping request contains an unrecognized modifier:
>
> Request: socket:PPR=5
>
> Please check your request and try again.
>
--
> [tesla49:30459] [[29390,0],0] ORTE_ERROR_LOG: Bad parameter in file
ess_hnp_module.c at line 510
>
--
> It looks like orte_init failed for some reason; your parallel process is
> likely to abort.  There are many reasons that a parallel process can
> fail during orte_init; some of which are due to configuration or
> environment problems.  This failure appears to be an internal failure;
> here's some additional information (which may only be relevant to an
> Open MPI developer):
>
> orte_rmaps_base_open failed
> --> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
>
--
>
>
>
> Is there any place where the new syntax is explained?
>
> Thanks in advance
> F
>
> --
> Mr. Filippo SPIGA, M.Sc. - HPC  Application Specialist
> High Performance Computing Service, University of Cambridge (UK)
> http://www.hpc.cam.ac.uk/ ~ http://filippospiga.me ~ skype: filippo.spiga
>
> «Nobody will drive us out of Cantor's paradise.» ~ David Hilbert
>
> *
> Disclaimer: "Please note this message and any attachments are
CONFIDENTIAL and may be privileged or otherwise protected from disclosure.
The contents are not to be disclosed to anyone other than the
> addressee. Unauthorized recipients are requested to preserve this
confidentiality and to advise the sender immediately of any error in
transmission."
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users



[OMPI users] OpenMPI 1.7.5 and "--map-by" new syntax

2014-02-26 Thread Filippo Spiga
Dear all,

I am playing with Open MPI 1.7.5 and with the "--map-by" option, but I am not 
sure I am doing things correctly despite following the instructions. Here is 
what I got:

$mpirun -np 20 --npersocket 5 -bind-to core osu_alltoall 
--
The following command line options and corresponding MCA parameter have
been deprecated and replaced as follows:

  Command line options:
Deprecated:  --npersocket, -npersocket
Replacement: --map-by socket:PPR=N

  Equivalent MCA parameter:
Deprecated:  rmaps_base_n_persocket, rmaps_ppr_n_persocket
Replacement: rmaps_base_mapping_policy=socket:PPR=N

The deprecated forms *will* disappear in a future version of Open MPI.
Please update to the new syntax.
--


after changing according to the instructions I see

$ mpirun -np 24 --map-by socket:PPR=5 -bind-to core osu_alltoall

--
The mapping request contains an unrecognized modifier:

  Request: socket:PPR=5

Please check your request and try again.
--
[tesla49:30459] [[29390,0],0] ORTE_ERROR_LOG: Bad parameter in file 
ess_hnp_module.c at line 510
--
It looks like orte_init failed for some reason; your parallel process is
likely to abort.  There are many reasons that a parallel process can
fail during orte_init; some of which are due to configuration or
environment problems.  This failure appears to be an internal failure;
here's some additional information (which may only be relevant to an
Open MPI developer):

  orte_rmaps_base_open failed
  --> Returned value Bad parameter (-5) instead of ORTE_SUCCESS
--



Is there any place where the new syntax is explained? 

Thanks in advance
F

--
Mr. Filippo SPIGA, M.Sc. - HPC  Application Specialist
High Performance Computing Service, University of Cambridge (UK)
http://www.hpc.cam.ac.uk/ ~ http://filippospiga.me ~ skype: filippo.spiga

«Nobody will drive us out of Cantor's paradise.» ~ David Hilbert

*
Disclaimer: "Please note this message and any attachments are CONFIDENTIAL and 
may be privileged or otherwise protected from disclosure. The contents are not 
to be disclosed to anyone other than the addressee. Unauthorized recipients are 
requested to preserve this confidentiality and to advise the sender immediately 
of any error in transmission."


Re: [OMPI users] run a program

2014-02-26 Thread jody
Hi Raha
Yes, that is correct.
You have to make sure that max-slots is less than or equal to the number of CPUs
in the node, to avoid oversubscribing it.

Have a look at the other entries in the FAQ; they give information on many
other options you can use.
   http://www.open-mpi.org/faq/?category=running

Jody
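
For illustration, a minimal hostfile for a setup like the one described in this
thread could look as follows; the hostnames and slot counts are invented and
must be adapted to the real machines, while the pw.x command is the one from
the original post:

    # hosts.txt: one line per machine, slots = number of cores to use there
    server slots=2
    node01 slots=6
    node02 slots=6
    node03 slots=6

    mpirun -np 20 --hostfile hosts.txt /home/khalili/espresso-5.0.2/bin/pw.x -in si.in | tee si.out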


On Wed, Feb 26, 2014 at 10:38 AM, raha khalili wrote:

> Dear Jody
>
> Thank you for your reply. Based on hostfile examples you show me, I
> understand 'slots' is number of cpus of each node I mentioned in the file,
> am I true?
>
> Wishes
>
>
> On Wed, Feb 26, 2014 at 1:02 PM, jody  wrote:
>
>> Hi
>> I think you should use the "--host" or "--hostfile" options:
>>   http://www.open-mpi.org/faq/?category=running#simple-spmd-run
>>   http://www.open-mpi.org/faq/?category=running#mpirun-host
>> Hope this helps
>>   Jody
>>
>>
>> On Wed, Feb 26, 2014 at 8:31 AM, raha khalili 
>> wrote:
>>
>>>  Dear Users
>>>
>>> This is my first post in open-mpi forum and I am beginner in using mpi.
>>> I want to run a program which does between 4 systems consist of one
>>> server and three nodes with 20 cpus. When I run: *mpirun -np 20
>>> /home/khalili/espresso-5.0.2/bin/pw.x -in si.in  | tee 
>>> si.out*, after writing htop from terminal, it seems the program doesn't use 
>>> cpus
>>> of three other nodes and just use the cpus of server. Could you tell me
>>> please how do I can use all my cpus.
>>>
>>> Regards
>>> --
>>> Khadije Khalili
>>> Ph.D Student of Solid-State Physics
>>> Department of Physics
>>> University of Mazandaran
>>> Babolsar, Iran
>>> kh.khal...@stu.umz.ac.ir
>>>
>>>
>>> ___
>>> users mailing list
>>> us...@open-mpi.org
>>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>>
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
>
> --
> Khadije Khalili
> Ph.D Student of Solid-State Physics
> Department of Physics
> University of Mazandaran
> Babolsar, Iran
> kh.khal...@stu.umz.ac.ir
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] run a program

2014-02-26 Thread raha khalili
Dear Jody

Thank you for your reply. Based on the hostfile examples you showed me, I
understand that 'slots' is the number of CPUs of each node I list in the file.
Is that right?

Wishes


On Wed, Feb 26, 2014 at 1:02 PM, jody  wrote:

> Hi
> I think you should use the "--host" or "--hostfile" options:
>   http://www.open-mpi.org/faq/?category=running#simple-spmd-run
>   http://www.open-mpi.org/faq/?category=running#mpirun-host
> Hope this helps
>   Jody
>
>
> On Wed, Feb 26, 2014 at 8:31 AM, raha khalili 
> wrote:
>
>> Dear Users
>>
>> This is my first post in open-mpi forum and I am beginner in using mpi.
>> I want to run a program which does between 4 systems consist of one
>> server and three nodes with 20 cpus. When I run: *mpirun -np 20
>> /home/khalili/espresso-5.0.2/bin/pw.x -in si.in  | tee 
>> si.out*, after writing htop from terminal, it seems the program doesn't use 
>> cpus
>> of three other nodes and just use the cpus of server. Could you tell me
>> please how do I can use all my cpus.
>>
>> Regards
>> --
>> Khadije Khalili
>> Ph.D Student of Solid-State Physics
>> Department of Physics
>> University of Mazandaran
>> Babolsar, Iran
>> kh.khal...@stu.umz.ac.ir
>>
>>
>> ___
>> users mailing list
>> us...@open-mpi.org
>> http://www.open-mpi.org/mailman/listinfo.cgi/users
>>
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Khadije Khalili
Ph.D Student of Solid-State Physics
Department of Physics
University of Mazandaran
Babolsar, Iran
kh.khal...@stu.umz.ac.ir


Re: [OMPI users] run a program

2014-02-26 Thread raha khalili
Dear  John Hearns

Thank you for your prompt reply. Could you please send me a sample hostfile,
and a sample command that I should use for my program, based on my last post?

Wishes


On Wed, Feb 26, 2014 at 12:49 PM, John Hearns wrote:

> Khadije - you need to give a list of compute hosts to mpirun.
> And probably have to set up passwordless ssh to each host.
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>



-- 
Khadije Khalili
Ph.D Student of Solid-State Physics
Department of Physics
University of Mazandaran
Babolsar, Iran
kh.khal...@stu.umz.ac.ir


Re: [OMPI users] run a program

2014-02-26 Thread jody
Hi
I think you should use the "--host" or "--hostfile" options:
  http://www.open-mpi.org/faq/?category=running#simple-spmd-run
  http://www.open-mpi.org/faq/?category=running#mpirun-host
Hope this helps
  Jody


On Wed, Feb 26, 2014 at 8:31 AM, raha khalili wrote:

> Dear Users
>
> This is my first post in open-mpi forum and I am beginner in using mpi.
> I want to run a program which does between 4 systems consist of one server
> and three nodes with 20 cpus. When I run: *mpirun -np 20
> /home/khalili/espresso-5.0.2/bin/pw.x -in si.in  | tee si.out*, 
> after writing htop from terminal, it seems the program doesn't use cpus
> of three other nodes and just use the cpus of server. Could you tell me
> please how do I can use all my cpus.
>
> Regards
> --
> Khadije Khalili
> Ph.D Student of Solid-State Physics
> Department of Physics
> University of Mazandaran
> Babolsar, Iran
> kh.khal...@stu.umz.ac.ir
>
>
> ___
> users mailing list
> us...@open-mpi.org
> http://www.open-mpi.org/mailman/listinfo.cgi/users
>


Re: [OMPI users] run a program

2014-02-26 Thread John Hearns
Khadije - you need to give a list of compute hosts to mpirun.
And probably have to set up passwordless ssh to each host.


[OMPI users] run a program

2014-02-26 Thread raha khalili
Dear Users

This is my first post on the open-mpi forum, and I am a beginner at using MPI.
I want to run a program across 4 systems consisting of one server and three
nodes, with 20 CPUs. When I run *mpirun -np 20
/home/khalili/espresso-5.0.2/bin/pw.x -in si.in | tee si.out* and then watch
htop in a terminal, it seems the program doesn't use the CPUs of the three
other nodes and only uses the CPUs of the server. Could you please tell me how
I can use all my CPUs?

Regards
-- 
Khadije Khalili
Ph.D Student of Solid-State Physics
Department of Physics
University of Mazandaran
Babolsar, Iran
kh.khal...@stu.umz.ac.ir


[OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Hi,

I tried to run an MPI Java program with --bind-to core. I receive the
following warning and wonder how to fix this.


WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  192.168.0.19

This is a warning only; your job will continue, though performance may
be degraded.


Thank you,
Saliya

-- 
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org