I notice that Absoft's MTT runs are failing due to the change to 
bind-to-core-by-default:

   http://mtt.open-mpi.org/index.php?do_redir=2136

I asked Tony, who runs the Absoft MTT runs; he confirms that this particular 
machine has 1 socket with 2 cores (and we're running -np 4 on this machine).

1. This is an unintended consequence of the bind-to-core-by-default policy: we 
fail with "oversubscribed!" when running on a single machine for test runs like 
this.  Do we like this? 

See #3, below, for more on this.

2. Also, the error message that is displayed says:

-----
A request was made to bind to that would result in binding more
processes than cpus on a resource:

   Bind to:         CORE
   Node:            ltljoe3
   #processes:  2
   #cpus:          1
-----

That is odd, because the command line is "mpirun -np 4 --mca btl sm,tcp,self 
./c_hello" on a machine with 2 cores, yet the message reports 2 processes and 
1 cpu.  Any idea what's happening here?

3. Finally, we're giving a warning saying:

-----
WARNING: a request was made to bind a process. While the system
supports binding the process itself, at least one node does NOT
support binding memory to the process location.
-----

For both #1 and #3, I wonder whether we should avoid erroring or warning when 
no binding was explicitly requested (i.e., we're just using the defaults).  
Specifically, if no binding is specified:

- if we oversubscribe, (possibly) warn about the performance loss of 
oversubscription, and don't bind
- don't warn about lack of memory binding
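
To make the proposal concrete, here is a rough sketch of the decision logic. 
It is illustrative only: the function, its parameters, and the message text 
are all hypothetical and are not taken from the Open MPI code base.  It also 
assumes that an explicit binding request keeps today's behavior (error out 
when oversubscribed, warn when memory binding is unavailable).

```python
from dataclasses import dataclass, field


@dataclass
class BindDecision:
    """Outcome of the (hypothetical) binding policy check."""
    bind: bool                      # bind processes to cores?
    fatal: bool                     # abort the launch?
    warnings: list = field(default_factory=list)


def decide_binding(nprocs, ncores, user_requested_binding,
                   memory_binding_supported):
    """Sketch of the proposed policy; every name here is an assumption."""
    warnings = []
    oversubscribed = nprocs > ncores

    if oversubscribed:
        if user_requested_binding:
            # Assumed: an explicit request still fails, as it does today.
            return BindDecision(bind=False, fatal=True,
                                warnings=["cannot bind: more processes than cpus"])
        # Default policy: (possibly) warn about performance and skip binding.
        warnings.append("oversubscribed: expect performance loss; not binding")
        return BindDecision(bind=False, fatal=False, warnings=warnings)

    # Only mention missing memory binding if the user asked for binding.
    if user_requested_binding and not memory_binding_supported:
        warnings.append("node does not support binding memory")

    return BindDecision(bind=True, fatal=False, warnings=warnings)
```

With the Absoft case (4 processes, 2 cores, no explicit binding), 
decide_binding(4, 2, False, False) would skip binding and emit only the 
oversubscription warning instead of failing the run.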

Thoughts?

-- 
Jeff Squyres
jsquy...@cisco.com
For corporate legal information go to: 
http://www.cisco.com/web/about/doing_business/legal/cri/
