Re: [OMPI users] Binding to Core Warning
Thank you. Anyway, your email contains a good amount of info.

Saliya

On Wed, Feb 26, 2014 at 7:48 PM, Ralph Castain wrote:

> I did one "chapter" of it on Jeff's blog and probably should complete it.
> Definitely need to update the FAQ for the new options.
>
> Sadly, outside of that and the mpirun man page, there isn't much available
> yet. I'm woefully far behind on it.
Re: [OMPI users] Binding to Core Warning
I did one "chapter" of it on Jeff's blog and probably should complete it.
Definitely need to update the FAQ for the new options.

Sadly, outside of that and the mpirun man page, there isn't much available
yet. I'm woefully far behind on it.

On Feb 26, 2014, at 4:47 PM, Saliya Ekanayake wrote:

> Thank you Ralph, this is very insightful and I think I can better
> understand the performance of our application.
>
> If I may ask, is there a document describing these affinity options? I've
> been looking at the tuning FAQ and Jeff's blog posts.
>
> Thank you,
> Saliya
Re: [OMPI users] Binding to Core Warning
Thank you Ralph, this is very insightful and I think I can better
understand the performance of our application.

If I may ask, is there a document describing these affinity options? I've
been looking at the tuning FAQ and Jeff's blog posts.

Thank you,
Saliya

On Wed, Feb 26, 2014 at 7:34 PM, Ralph Castain wrote:

> On Feb 26, 2014, at 4:29 PM, Saliya Ekanayake wrote:
>
>> I see, so if I understand correctly, the best scenario for threads would
>> be to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads
>> in each proc.
>
> Yes, that would be the best solution. If you have 4 cores in each socket,
> then just bind each proc to the socket:
>
>     --map-by socket --bind-to socket
>
> If you want to put one proc on each socket by itself, then do
>
>     --map-by ppr:1:socket --bind-to socket
>
>> Also, as you've mentioned binding threads to get memory locality, I guess
>> this has to be done at the application level and is not an option in
>> OMPI.
>
> Sadly yes - the problem is that MPI lacks an init call for each thread,
> and so we don't see the threads being started. You can use hwloc to bind
> each thread, but it has to be done in the app itself.
Re: [OMPI users] Binding to Core Warning
On Feb 26, 2014, at 4:29 PM, Saliya Ekanayake wrote:

> I see, so if I understand correctly, the best scenario for threads would
> be to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads
> in each proc.

Yes, that would be the best solution. If you have 4 cores in each socket,
then just bind each proc to the socket:

    --map-by socket --bind-to socket

If you want to put one proc on each socket by itself, then do

    --map-by ppr:1:socket --bind-to socket

> Also, as you've mentioned binding threads to get memory locality, I guess
> this has to be done at the application level and is not an option in
> OMPI.

Sadly yes - the problem is that MPI lacks an init call for each thread,
and so we don't see the threads being started. You can use hwloc to bind
each thread, but it has to be done in the app itself.
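Note on binding threads inside the application: the reply above recommends
hwloc for this. As a rough illustration of the same idea only, the sketch
below pins each worker thread to one CPU using the Linux scheduler affinity
calls exposed by Python's os module, which is a swapped-in technique and not
hwloc. It assumes Linux and Python 3.8 or later; run_pinned and work are
hypothetical names, and the sketch controls where threads run, not where
their memory is allocated.

```python
import os
import threading


def run_pinned(cpus, work):
    """Start one thread per CPU id, pin each thread to its CPU, run work(cpu).

    Returns {cpu: (affinity_seen_by_thread, work_result)}.
    Hypothetical helper for illustration; Linux only.
    """
    results = {}
    lock = threading.Lock()

    def body(cpu):
        # get_native_id() is the kernel thread id. On Linux the affinity
        # call accepts it and changes only this thread, not the process.
        tid = threading.get_native_id()
        os.sched_setaffinity(tid, {cpu})
        seen = os.sched_getaffinity(tid)
        value = work(cpu)
        with lock:
            results[cpu] = (seen, value)

    threads = [threading.Thread(target=body, args=(c,)) for c in cpus]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Pin only to CPUs this process is already allowed to use, for example
    # the cores that mpirun bound the process to.
    allowed = sorted(os.sched_getaffinity(0))
    pinned = run_pinned(allowed, lambda cpu: None)
    for cpu, (seen, _) in sorted(pinned.items()):
        print(f"thread for cpu {cpu} is now allowed on {sorted(seen)}")
```

Starting from the process's own allowed set keeps the threads inside whatever
binding the launcher applied, so a process bound to 4 cores would spread its
threads over those 4 cores rather than stacking them on one.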
Re: [OMPI users] Binding to Core Warning
I see, so if I understand correctly, the best scenario for threads would be
to bind 2 procs to sockets as --map-by socket:pe=4 and use 4 threads in each
proc.

Also, as you've mentioned binding threads to get memory locality, I guess
this has to be done at the application level and is not an option in OMPI.

Thank you,
Saliya

On Wed, Feb 26, 2014 at 4:50 PM, Ralph Castain wrote:

> Sorry, had to run some errands.
>
> On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake wrote:
>
>> Is it possible to bind to cores of multiple sockets? Say I have a machine
>> with 2 sockets, each with 4 cores. If I run 8 threads with 1 proc, can I
>> utilize all 8 cores for the 8 threads?
>
> In that scenario, you won't get any benefit from binding as we only bind
> at the proc level (and binding to the entire node does nothing). You might
> want to bind your threads, however, as otherwise the threads will not
> necessarily execute local to any memory they malloc.
Re: [OMPI users] Binding to Core Warning
Sorry, had to run some errands.

On Feb 26, 2014, at 1:03 PM, Saliya Ekanayake wrote:

> Is it possible to bind to cores of multiple sockets? Say I have a machine
> with 2 sockets, each with 4 cores. If I run 8 threads with 1 proc, can I
> utilize all 8 cores for the 8 threads?

In that scenario, you won't get any benefit from binding as we only bind at
the proc level (and binding to the entire node does nothing). You might want
to bind your threads, however, as otherwise the threads will not necessarily
execute local to any memory they malloc.
Re: [OMPI users] Binding to Core Warning
Is it possible to bind to cores of multiple sockets? Say I have a machine
with 2 sockets, each with 4 cores. If I run 8 threads with 1 proc, can I
utilize all 8 cores for the 8 threads?

Thank you for the speedy replies

Saliya

On Wed, Feb 26, 2014 at 3:21 PM, Ralph Castain wrote:

> On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake wrote:
>
>> I have a followup question on this. In our application we have parallel
>> for loops similar to OMP parallel for. I noticed that in order to gain
>> speedup with threads I have to set --bind-to none; otherwise multiple
>> threads will bind to the same core, giving no increase in performance.
>> For example, I get the following (attached) performance for a simple
>> 3-point stencil computation run with T threads on 1 MPI process on 1
>> node (Tx1x1).
>>
>> My understanding is that even when there are multiple procs per node we
>> should use --bind-to none in order to get performance with threads. Is
>> this correct? Also, what's the disadvantage of not using --bind-to core?
>
> Your best performance with threads comes when you bind each process to
> multiple cores. Binding helps performance by ensuring your memory is
> always local, and provides some optimized scheduling benefits. You can
> bind to multiple cores by adding the qualifier "pe=N" to your mapping
> definition, like this:
>
>     mpirun --map-by socket:pe=4
>
> The above example will map processes by socket, and bind each process to
> 4 cores.
>
> HTH
> Ralph
Re: [OMPI users] Binding to Core Warning
On Feb 26, 2014, at 12:17 PM, Saliya Ekanayake wrote:

> I have a followup question on this. In our application we have parallel
> for loops similar to OMP parallel for. I noticed that in order to gain
> speedup with threads I have to set --bind-to none; otherwise multiple
> threads will bind to the same core, giving no increase in performance.
> For example, I get the following (attached) performance for a simple
> 3-point stencil computation run with T threads on 1 MPI process on 1
> node (Tx1x1).
>
> My understanding is that even when there are multiple procs per node we
> should use --bind-to none in order to get performance with threads. Is
> this correct? Also, what's the disadvantage of not using --bind-to core?

Your best performance with threads comes when you bind each process to
multiple cores. Binding helps performance by ensuring your memory is always
local, and provides some optimized scheduling benefits. You can bind to
multiple cores by adding the qualifier "pe=N" to your mapping definition,
like this:

    mpirun --map-by socket:pe=4

The above example will map processes by socket, and bind each process to 4
cores.

HTH
Ralph
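Note on checking what a process is actually bound to: one way to observe the
behaviour discussed above is to have each process report the set of logical
CPUs it may run on. The sketch below does that with Python's os module on
Linux. The function names are hypothetical, and the expected counts (a single
core under --bind-to core, four under --map-by socket:pe=4) are inferred from
the thread, not measured here; hardware threads may make the reported set
larger.

```python
import os


def usable_cores():
    """Return the sorted logical CPU ids the calling process may run on.

    Linux only; reflects whatever binding the launcher applied.
    """
    return sorted(os.sched_getaffinity(0))


def suggested_thread_count(requested):
    """Cap a requested thread count at the number of CPUs we are bound to."""
    return max(1, min(requested, len(usable_cores())))


if __name__ == "__main__":
    cores = usable_cores()
    print(f"pid {os.getpid()} may run on {len(cores)} CPU(s): {cores}")
    print(f"threads worth starting for a request of 8: "
          f"{suggested_thread_count(8)}")
```

If a process reports a single CPU, all of its threads share that CPU, which
would match the lack of speedup reported with --bind-to core.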
Re: [OMPI users] Binding to Core Warning
I have a followup question on this. In our application we have parallel for loops similar to OMP parallel for. I noticed that in order to gain speedup with threads I have to set --bind-to none, otherwise multiple threads will bind to the same core, giving no increase in performance. For example, I get the following (attached) performance for a simple 3-point stencil computation run with T threads on 1 MPI process on 1 node (Tx1x1).

My understanding is that even when there are multiple procs per node we should use --bind-to none in order to get performance with threads. Is this correct? Also, what's the disadvantage of not using --bind-to core?

Thank you,
Saliya

On Wed, Feb 26, 2014 at 11:01 AM, Saliya Ekanayake wrote:

> Thank you Ralph, I'll check this.
Re: [OMPI users] Binding to Core Warning
Thank you Ralph, I'll check this.

On Wed, Feb 26, 2014 at 10:04 AM, Ralph Castain wrote:

> It means that OMPI didn't get built against libnuma, and so we can't
> ensure that memory is being bound local to the proc binding. Check to see
> if numactl and numactl-devel are installed, or you can turn off the warning
> using "-mca hwloc_base_mem_bind_failure_action silent"
Re: [OMPI users] Binding to Core Warning
It means that OMPI didn't get built against libnuma, and so we can't ensure that memory is being bound local to the proc binding. Check to see if numactl and numactl-devel are installed, or you can turn off the warning using "-mca hwloc_base_mem_bind_failure_action silent"

On Feb 25, 2014, at 10:32 PM, Saliya Ekanayake wrote:

> Hi,
>
> I tried to run an MPI Java program with --bind-to core. I receive the
> following warning and wonder how to fix this.
>
> WARNING: a request was made to bind a process. While the system
> supports binding the process itself, at least one node does NOT
> support binding memory to the process location.
>
>   Node: 192.168.0.19
>
> This is a warning only; your job will continue, though performance may
> be degraded.
>
> Thank you,
> Saliya
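The two options in this reply can be sketched as a short shell script. This is only an illustration: it assumes an RPM-based node (the package names numactl and numactl-devel suggest one, but the thread does not say so), and the process count and the Java class name `Hello` are hypothetical. Since the warning names a specific node, the package check would need to be run there as well.

```shell
#!/bin/sh
# Assumptions (not from the reply): RPM-based node; "Hello" is a
# hypothetical Java class; 4 processes is an arbitrary example count.

# Option 1: check whether the packages named in the reply are installed.
if command -v rpm >/dev/null 2>&1; then
    rpm -q numactl numactl-devel || echo "numactl packages missing"
else
    echo "rpm not available; use your distribution's package query instead"
fi

# Option 2: silence the warning with the MCA parameter from the reply.
SILENCE="-mca hwloc_base_mem_bind_failure_action silent"
CMD="mpirun $SILENCE --bind-to core -np 4 java -cp . Hello"

# Dry run: show the command instead of executing it.
echo "$CMD"
```

Silencing the warning only hides the message. Memory binding stays unavailable until Open MPI is built against libnuma, so the possible performance degradation mentioned in the warning still applies.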
[OMPI users] Binding to Core Warning
Hi,

I tried to run an MPI Java program with --bind-to core. I receive the following warning and wonder how to fix this.

WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.

  Node: 192.168.0.19

This is a warning only; your job will continue, though performance may
be degraded.

Thank you,
Saliya

--
Saliya Ekanayake esal...@gmail.com
Cell 812-391-4914 Home 812-961-6383
http://saliya.org