Ah, that sheds some light. There is indeed a significant change between earlier
releases and 1.8.1 and above that might explain what he is seeing.
Specifically, we no longer hammer the CPU while in MPI_Finalize. So if 16 of
the procs are finishing early (which the output would suggest), then
On 22/08/14 10:43, Ralph Castain wrote:
> From your earlier concerns, I would have expected only to find 32 of
> them running. Was that not the case in this run?
As I understand it, in his original email he mentioned that with 1.6.5
all 48 processes were running at 100% CPU, and he was wondering if th
I think maybe I'm misunderstanding something. This shows that all 48 procs ran
and terminated normally.
From your earlier concerns, I would have expected only to find 32 of them
running. Was that not the case in this run?
On Aug 21, 2014, at 4:57 PM, Andrej Prsa wrote:
Whoops, jumped the gun there before the process finished. I'm attaching
the new stderr output.
> Hmmm...that's even weirder. It thinks it is going to start 48 procs,
> and the binding pattern even looks right.
>
> Hate to keep bothering you, but could you ensure this is a debug
> build (i.e., was configured with --enable-debug), and then set -mca
> odls_base_verbose 5 --leave-session-attached on the cmd line?
No bother at all -- would love to help. I recompiled 1.8.2rc4 with
debug and issued:
/usr/local/
Hmmm...that's even weirder. It thinks it is going to start 48 procs, and the
binding pattern even looks right.
Hate to keep bothering you, but could you ensure this is a debug build (i.e.,
was configured with --enable-debug), and then set -mca odls_base_verbose 5
--leave-session-attached on the cmd line?
> How odd - can you run it with --display-devel-map and send that
> along? It will give us a detailed statement of where it thinks
> everything should run.
Sure thing -- please find it attached.
Cheers,
Andrej
test.std.bz2
Description: application/bzip
How odd - can you run it with --display-devel-map and send that along? It will
give us a detailed statement of where it thinks everything should run.
On Aug 21, 2014, at 2:49 PM, Andrej Prsa wrote:
Hi Ralph,
Thanks for your reply!
> One thing you might want to try: add this to your mpirun cmd line:
>
> --display-allocation
>
> This will tell you how many slots we think we've been given on your
> cluster.
I tried that using 1.8.2rc4, this is what I get:
== ALLOCATED
One thing you might want to try: add this to your mpirun cmd line:
--display-allocation
This will tell you how many slots we think we've been given on your cluster.
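As a concrete sketch of the suggested invocation (the process count and `./my_app` are placeholders, not taken from the original message):

```shell
# Ask Open MPI to print the node/slot allocation it detected,
# then launch the job as usual; ./my_app is a placeholder program.
mpirun --display-allocation -np 48 ./my_app
```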
On Aug 21, 2014, at 12:50 PM, Ralph Castain wrote:
Starting early in the 1.7 series, we began to bind procs by default to cores
when -np <= 2, and to sockets if np > 2. Is it possible this is what you are
seeing?
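For anyone wanting to verify or override the default binding described above, a hedged sketch using standard Open MPI 1.7/1.8 options (`./my_app` and the process count are placeholders):

```shell
# Print where each rank ends up bound (core, socket, etc.):
mpirun --report-bindings -np 4 ./my_app
# Restore the pre-1.7 behaviour of not binding processes at all:
mpirun --bind-to none -np 4 ./my_app
```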
On Aug 21, 2014, at 12:45 PM, Andrej Prsa wrote:
Dear devels,
I have been trying out 1.8.2rcs recently and found a show-stopping
problem on our cluster. Running any job with any number of processors
larger than 32 will always employ only 32 cores per node (our nodes
have 48 cores). We are seeing identical behavior with 1.8.2rc4,
1.8.2rc2, and 1.
Thanks Ashley!
This is now fixed in r32568.
Cheers,
Gilles
On 2014/08/21 19:00, Ashley Pittman wrote:
One potential other issue: r32555 means that any other struct members are now
no longer zeroed. It might be worth adding a memset() or simply assigning a
value of {0} to the struct in order to preserve the old behaviour.
Ashley.
On 21 Aug 2014, at 04:31, Gilles Gouaillardet
wrote:
> Paul,