Re: [OMPI users] Binding to Core Warning

2014-02-27 Thread Saliya Ekanayake
Thank you. Anyway, your email contains a good amount of info.

Saliya



Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain
I did one "chapter" of it on Jeff's blog and probably should complete it. 
Definitely need to update the FAQ for the new options.

Sadly, outside of that and the mpirun man page, there isn't much available yet. 
I'm woefully far behind on it.


On Feb 26, 2014, at 4:47 PM, Saliya Ekanayake wrote:

> Thank you Ralph, this is very insightful and I think I can better understand
> the performance of our application.
> 
> If I may ask, is there a document describing these affinity options? I've been
> looking at the tuning FAQ and Jeff's blog posts.
> 
> Thank you,
> Saliya

Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Thank you Ralph, this is very insightful and I think I can better
understand the performance of our application.

If I may ask, is there a document describing these affinity options? I've
been looking at the tuning FAQ and Jeff's blog posts.

Thank you,
Saliya



Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain

On Feb 26, 2014, at 4:29 PM, Saliya Ekanayake wrote:

> I see, so if I understand correctly, the best scenario for threads would be 
> to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads in each 
> proc. 

Yes, that would be the best solution. If you have 4 cores in each socket, then 
just bind each proc to the socket:

--map-by socket --bind-to socket

If you want to put one proc on each socket by itself, then do

--map-by ppr:1:socket --bind-to socket
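
For example, on a hypothetical 2-socket node with 4 cores per socket, a complete
launch line for the one-proc-per-socket case might look like this (./my_app and
the thread count are placeholders for your own program):

mpirun -np 2 --map-by ppr:1:socket --bind-to socket -x OMP_NUM_THREADS=4 ./my_app

Adding --report-bindings to the command makes mpirun print each rank's binding
so you can verify the layout.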


> 
> Also, as you've mentioned binding threads to get memory locality, I guess
> this has to be done at the application level and is not an option in OMPI.

Sadly yes - the problem is that MPI lacks an init call for each thread, and so 
we don't see the threads being started. You can use hwloc to bind each thread, 
but it has to be done in the app itself.
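
As a rough sketch of the hwloc approach in C (a minimal illustration, assuming
hwloc and pthreads are available; the one-thread-per-core layout, the build
line, and the worker body are assumptions, not OMPI-provided code):

/* Each pthread binds itself to one core using hwloc.
 * Build (assumed): gcc bind_threads.c -o bind_threads -lhwloc -lpthread */
#include <hwloc.h>
#include <pthread.h>

static hwloc_topology_t topo;      /* loaded once in main(), read-only after */

static void *worker(void *arg)
{
    long idx = (long)arg;
    /* Look up the idx-th core and bind only the calling thread to it. */
    hwloc_obj_t core = hwloc_get_obj_by_type(topo, HWLOC_OBJ_CORE, (unsigned)idx);
    if (core != NULL)
        hwloc_set_cpubind(topo, core->cpuset, HWLOC_CPUBIND_THREAD);
    /* ... per-thread work (e.g., one slice of the stencil) goes here ... */
    return NULL;
}

int main(void)
{
    hwloc_topology_init(&topo);
    hwloc_topology_load(&topo);
    int n = hwloc_get_nbobjs_by_type(topo, HWLOC_OBJ_CORE);
    if (n <= 0)
        return 1;                  /* no core objects reported; give up */
    pthread_t tids[n];
    for (long i = 0; i < n; i++)
        pthread_create(&tids[i], NULL, worker, (void *)i);
    for (int i = 0; i < n; i++)
        pthread_join(tids[i], NULL);
    hwloc_topology_destroy(&topo);
    return 0;
}

Because each thread binds itself before touching its data, first-touch page
placement then keeps the memory it allocates on the local NUMA node.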


Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
I see, so if I understand correctly, the best scenario for threads would be
to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads in
each proc.

Also, as you've mentioned binding threads to get memory locality, I guess
this has to be done at the application level and is not an option in OMPI.

Thank you,
Saliya



Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain
Sorry, had to run some errands.

On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake wrote:

> Is it possible to bind to cores of multiple sockets? Say I have a machine
> with 2 sockets, each with 4 cores; if I run 8 threads in 1 proc, can I
> utilize all 8 cores for the 8 threads?

In that scenario, you won't get any benefit from binding as we only bind at the 
proc level (and binding to the entire node does nothing). You might want to 
bind your threads, however, as otherwise the threads will not necessarily 
execute local to any memory they malloc.




Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Is it possible to bind to cores of multiple sockets? Say I have a machine
with 2 sockets, each with 4 cores; if I run 8 threads in 1 proc, can I
utilize all 8 cores for the 8 threads?

Thank you for the speedy replies

Saliya




Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain

On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake wrote:

> I have a follow-up question on this. In our application we have parallel for
> loops similar to OMP parallel for. I noticed that in order to gain speedup
> with threads I have to set --bind-to none; otherwise multiple threads will
> bind to the same core, giving no increase in performance. For example, I get
> the following (attached) performance for a simple 3-point stencil computation
> run with T threads on 1 MPI process on 1 node (Tx1x1).
> 
> My understanding is that even when there are multiple procs per node we should
> use --bind-to none in order to get performance with threads. Is this correct?
> Also, what's the disadvantage of not using --bind-to core?

Your best performance with threads comes when you bind each process to multiple 
cores. Binding helps performance by ensuring your memory is always local, and 
provides some optimized scheduling benefits. You can bind to multiple cores by 
adding the qualifier "pe=N" to your mapping definition, like this:

mpirun --map-by socket:pe=4 

The above example will map processes by socket, and bind each process to 4 
cores.
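
For instance, 2 processes with 4 cores apiece on a hypothetical 2-socket,
8-core node (the program name is a placeholder):

mpirun -np 2 --map-by socket:pe=4 ./my_app

Each process would then run its 4 threads on its own 4 cores; adding
--report-bindings shows the core mask each rank actually received.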

HTH
Ralph




Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
I have a follow-up question on this. In our application we have parallel for
loops similar to OMP parallel for. I noticed that in order to gain speedup
with threads I have to set --bind-to none; otherwise multiple threads will
bind to the same core, giving no increase in performance. For example, I get
the following (attached) performance for a simple 3-point stencil computation
run with T threads on 1 MPI process on 1 node (Tx1x1).

My understanding is that even when there are multiple procs per node we should
use --bind-to none in order to get performance with threads. Is this correct?
Also, what's the disadvantage of not using --bind-to core?

Thank you,
Saliya




Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Thank you Ralph, I'll check this.




Re: [OMPI users] Binding to Core Warning

2014-02-26 Thread Ralph Castain
It means that OMPI didn't get built against libnuma, and so we can't ensure 
that memory is being bound local to the proc binding. Check to see if numactl 
and numactl-devel are installed, or you can turn off the warning using "-mca 
hwloc_base_mem_bind_failure_action silent"
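
For example, on an RPM-based system you might check and fix it along these
lines (the package manager, package names, and launch command are assumptions
for your particular setup):

rpm -q numactl numactl-devel
yum install numactl numactl-devel    # then rebuild OMPI so it links libnuma

or simply silence the warning at launch:

mpirun -mca hwloc_base_mem_bind_failure_action silent --bind-to core java MyMpiApp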





[OMPI users] Binding to Core Warning

2014-02-26 Thread Saliya Ekanayake
Hi,

I tried to run an MPI Java program with --bind-to core. I receive the
following warning and wonder how to fix this.


WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node:  192.168.0.19

This is a warning only; your job will continue, though performance may
be degraded.


Thank you,
Saliya

-- 
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org