My apologies - I should have included "--debug-daemons" for the mpirun cmd line
so that the stderr of the backend daemons would be output.
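For reference, a minimal sketch of what that invocation might look like (the hostfile name, rank count, and binary are placeholders, not from the original thread):

```shell
# Hypothetical example: re-run with backend daemon stderr forwarded.
# --debug-daemons keeps the orted daemons attached to mpirun so their
# stderr appears in the launching terminal.
mpirun --debug-daemons --hostfile hosts -np 32 ./xhpl
```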
> On Aug 10, 2020, at 10:28 AM, John Duffy via users
> wrote:
>
> Thanks Ralph
>
> I will do all of that. Much appreciated.
Well, we aren't really that picky :-) While I agree with Gilles that we are
unlikely to be able to help you resolve the problem, we can give you a couple
of ideas on how to chase it down.
First, be sure to build OMPI with "--enable-debug" and then try adding "--mca
oob_base_verbose 100" to your mpirun command line.
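Putting those two suggestions together, a sketch of the rebuild-and-rerun steps might look like this (install prefix, parallelism, rank count, and binary are assumptions for illustration):

```shell
# Hypothetical sketch: rebuild Open MPI with debug support enabled,
# then run with OOB verbosity so out-of-band connection setup between
# mpirun and the backend daemons is logged.
./configure --prefix=$HOME/ompi-debug --enable-debug
make -j4 && make install

# Use the debug build and turn up OOB framework verbosity.
$HOME/ompi-debug/bin/mpirun --mca oob_base_verbose 100 -np 32 ./xhpl
```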
Thanks Gilles
I realise this is “off topic”. I was hoping the Open-MPI ORTE/HNP message might
give me a clue where to look for my driver problem.
Regarding P/Q ratios, P=2 & Q=16 does indeed give me better performance.
Kind regards
John,
I am not sure you will get much help here with a kernel crash caused
by a tweaked driver.
About HPL, you are more likely to get better performance with P and Q
closer together (e.g. 4x8 is likely better than 2x16 or 1x32).
Also, HPL might have better performance with one MPI task per node and
a multi-threaded BLAS.
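Gilles' rule of thumb — choose the process grid with P and Q as close to square as possible — can be sketched quickly. This helper is purely illustrative (it is not part of HPL, and HPL requires P <= Q by convention, which the enumeration below respects):

```python
# Enumerate P x Q process grids for a given rank count, then pick the
# one whose Q/P ratio is closest to 1 (i.e. the most square grid).
def grids(n):
    # All factor pairs (p, q) of n with p <= q.
    return [(p, n // p) for p in range(1, n + 1)
            if n % p == 0 and p <= n // p]

def most_square(n):
    # Smaller q/p means p and q are closer together.
    return min(grids(n), key=lambda pq: pq[1] / pq[0])

print(grids(32))        # all candidate grids for 32 ranks
print(most_square(32))  # the grid Gilles recommends: (4, 8)
```

For 32 ranks this prefers 4x8 over 2x16 and 1x32, matching the advice above.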
Hi
I have generated this problem myself by tweaking the MTU of my 8-node Raspberry
Pi 4 cluster to 9000 bytes, but I would be grateful for any ideas/suggestions
on how to relate the Open-MPI ORTE message to my tweaking.
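For anyone reproducing or reverting this kind of MTU tweak, a sketch of the relevant commands follows (the interface name eth0 is an assumption; the Pi's driver is what is suspected of misbehaving at 9000 bytes):

```shell
# Hypothetical sketch: enable jumbo frames on a node, verify, and revert.
sudo ip link set dev eth0 mtu 9000   # the tweak described above
ip link show dev eth0                # confirm the MTU actually changed
sudo ip link set dev eth0 mtu 1500   # revert to the default if unstable
```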
When I run HPL Linpack using my “improved” cluster, it runs quite happily