On May 30, 2012, at 9:47 AM, Mike Dubman wrote:

> ohh.. you are right, false alarm :) sorry siblings != cores - so it is HT

OMPI 1.6 (soon to be 1.6.1) should handle HT properly, meaning that it will 
bind to all the hyperthreads in a core and/or socket.
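
If you want to verify what it actually does on your nodes, something like the 
following should print the bindings (this is just a sketch assuming the 
1.6-era --bind-to-core and --report-bindings options; adjust -np and the host 
list for your setup):

  $ mpirun -np 4 --bind-to-core --report-bindings hostname

With HT handled correctly, each rank's reported binding should cover both 
hardware threads of its core.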

Are you using Linux cgroups/cpusets to restrict the available cores?  I ask 
because Brice is pointing out that the E5-2650 is supposed to have more cores 
than lstopo is showing.
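
A quick way to check whether something is restricting the cores (illustrative 
commands using standard /proc interfaces; exact cpuset paths vary by distro 
and resource manager):

  $ grep -c ^processor /proc/cpuinfo           # logical CPUs the kernel sees at all
  $ grep Cpus_allowed_list /proc/self/status   # CPUs this shell is allowed to run on
  $ cat /proc/self/cpuset                      # cpuset this shell belongs to, if cpusets are in use

If /proc/cpuinfo itself only lists 4 logical CPUs, the missing cores are 
hidden below Linux (firmware/BIOS); if the allowed list is narrower than what 
cpuinfo reports, a cgroup/cpuset is restricting the job.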


> On Wed, May 30, 2012 at 4:36 PM, Brice Goglin <brice.gog...@inria.fr> wrote:
> Your /proc/cpuinfo output (filtered below) shows only two sockets 
> (physical ids 0 and 1), each with one core (cpu cores=1, core id=0) and 
> hyperthreading (siblings=2). So lstopo looks correct.
> The E5-2650 is supposed to have 8 cores. I assume you use Linux 
> cgroups/cpusets to restrict the available cores; the misconfiguration may 
> be there.
> Brice
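
For what it's worth, a couple of illustrative one-liners to cross-check that 
reading of /proc/cpuinfo (standard tools only, nothing Open MPI-specific):

  $ grep "physical id" /proc/cpuinfo | sort -u                                    # distinct sockets
  $ grep -E "^(physical id|core id)" /proc/cpuinfo | paste - - | sort -u | wc -l  # unique (socket, core) pairs = physical cores

On the box shown below that yields 2 sockets and 2 physical cores, which 
matches what lstopo is reporting.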
> 
> 
> 
> 
> On 30/05/2012 15:14, Mike Dubman wrote:
>> or lstopo lies (I'm not using the latest hwloc, but the one that comes 
>> with the distro).
>> The machine has two dual-core sockets, 4 physical cores total:
>> processor       : 0
>> physical id     : 0
>> siblings        : 2
>> core id         : 0
>> cpu cores       : 1
>> 
>> processor       : 1
>> physical id     : 1
>> siblings        : 2
>> core id         : 0
>> cpu cores       : 1
>> 
>> processor       : 2
>> physical id     : 0
>> siblings        : 2
>> core id         : 0
>> cpu cores       : 1
>> 
>> processor       : 3
>> physical id     : 1
>> siblings        : 2
>> core id         : 0
>> cpu cores       : 1
>> 
>> 
>> 
>> On Wed, May 30, 2012 at 3:40 PM, Ralph Castain <r...@open-mpi.org> wrote:
>> Hmmm...well, from what I see, mpirun was actually giving you the right 
>> answer! I only see TWO cores on each node, yet you told it to bind FOUR 
>> processes on each node, each proc to be bound to a unique core.
>> 
>> The error message was correct - there are not enough cores on those nodes to 
>> do what you requested.
>> 
>> 
>> On May 30, 2012, at 6:19 AM, Mike Dubman wrote:
>> 
>>> attached.
>>> 
>>> On Wed, May 30, 2012 at 2:32 PM, Jeff Squyres <jsquy...@cisco.com> wrote:
>>> On May 30, 2012, at 7:20 AM, Jeff Squyres wrote:
>>> 
>>> >> $ hwloc-ls --of console
>>> >> Machine (32GB)
>>> >>   NUMANode L#0 (P#0 16GB) + Socket L#0 + L3 L#0 (20MB) + L2 L#0 (256KB) + L1 L#0 (32KB) + Core L#0
>>> >>     PU L#0 (P#0)
>>> >>     PU L#1 (P#2)
>>> >>   NUMANode L#1 (P#1 16GB) + Socket L#1 + L3 L#1 (20MB) + L2 L#1 (256KB) + L1 L#1 (32KB) + Core L#1
>>> >>     PU L#2 (P#1)
>>> >>     PU L#3 (P#3)
>>> >
>>> > Is this hwloc output exactly the same on both nodes?
>>> 
>>> 
>>> More specifically, can you send the lstopo xml output from each of the 2 
>>> nodes you ran on?
>>> 
>>> --
>>> Jeff Squyres
>>> jsquy...@cisco.com
>>> For corporate legal information go to: 
>>> http://www.cisco.com/web/about/doing_business/legal/cri/
>>> 
>>> 
>>> <lstopo-out.tbz>


-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/

