Hi Ian,
Thanks for your quick reply!
A couple of other people here have successfully used the
toolkit with this version of OpenMPI, so I'm not sure whether it is at
fault. No other version of OpenMPI is currently available on
the cluster, but I could ask for that to be remedied.
I have tried using --procs=48 and --num-threads=2, but I see the same problem.
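For reference, that run used essentially the same submit command as the
original one quoted below, only with a smaller core count and a new
simulation name (mctest-small is just a placeholder):

  ./simfactory/bin/sim submit mctest-small \
      --configuration=mclachlantest_mpidebug \
      --parfile=par/qc0-mclachlan.par --procs=48 --num-threads=2 \
      --walltime=10:0:0 --queue=normal --machine=lengau-intel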
As you suggested, I recompiled from a clean configuration to double-check.
I have also tried the May 16 version, but to no avail.
Could something have gone wrong during compilation, even though it
finishes successfully with "done"?
Cheers,
Chris
On 05/15/2017 09:10 PM, Ian Hinder wrote:
On 15 May 2017, at 16:49, Chris Stevens <[email protected]> wrote:
Hi there,
I am new to Cactus, and have been having trouble getting the
qc0-mclachlan.par test file to run. I have compiled the latest
version of Cactus successfully on the CHPC cluster in South Africa.
I have attached the .out and .err files for the run, along with my
machine file, optionlist, and run and submit scripts. The submit
command was:
  ./simfactory/bin/sim submit mctest \
      --configuration=mclachlantest_mpidebug \
      --parfile=par/qc0-mclachlan.par --procs=240 --num-threads=12 \
      --walltime=10:0:0 --queue=normal --machine=lengau-intel
With --mca orte_base_help_aggregate 0 added to the mpirun command in
the runscript, the error is:
[cnode0823:136405] *** An error occurred in MPI_Comm_create_keyval
[cnode0823:136405] *** reported by process [476512257,3]
[cnode0823:136405] *** on communicator MPI_COMM_WORLD
[cnode0823:136405] *** MPI_ERR_ARG: invalid argument of some other kind
[cnode0823:136405] *** MPI_ERRORS_ARE_FATAL (processes in this
communicator will now abort,
[cnode0823:136405] *** and potentially your MPI job)
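For context, the mpirun invocation in my runscript has essentially the
following shape (the @...@ tokens are simfactory's usual substitution
placeholders; the exact line is in the attached runscript, so take this
as illustrative):

  mpirun --mca orte_base_help_aggregate 0 -np @NUM_PROCS@ @EXECUTABLE@ @PARFILE@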
Unfortunately I have no idea where to go from here, so any help
would be greatly appreciated! I hope I have attached enough information.
Hi Chris,
Welcome to Cactus! (meant in a friendly sense, not sarcastic!)
I cannot see anything wrong, and I've never seen this error before.
It's a mystery. Have you tried running on fewer MPI processes? I
wonder if something is going wrong because the problem size is too
small for the number of processes. This *shouldn't* cause a problem,
but it's something to try. Is there another MPI implementation, or
another version of OpenMPI, available on the machine? Maybe it's a
bug in OpenMPI 1.8.8.
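If the cluster uses environment modules (an assumption on my part; most
clusters of this kind do), you can see which MPI installations are on
offer with something like

  module avail openmpi
  module avail mpi

and then point the MPI_DIR option in your optionlist at whichever
installation you load instead.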
When faced with such strange behaviour, it's always worth wiping the
configuration and rebuilding it, just in case it was not built cleanly.
For example, while developing the optionlist you might have partially
built the configuration, then corrected something, leaving the
configuration with a mixture of the two versions. You can wipe it with
rm -rf configs/mclachlantest_mpidebug, though I suspect that the debug
version was built cleanly, so this is unlikely to be the problem.
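Roughly, the clean rebuild would look like this; the optionlist and
thornlist names here are only placeholders, so substitute whatever you
used for the original build:

  rm -rf configs/mclachlantest_mpidebug
  ./simfactory/bin/sim build mclachlantest_mpidebug \
      --optionlist=lengau-intel.cfg --thornlist=manifest/einsteintoolkit.th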
--
Ian Hinder
http://members.aei.mpg.de/ianhin
--
Dr Chris Stevens
Department of Mathematics
Rhodes University
Room 5
Ph: +27 46 603 8932