I see this error message in your output:
No HDF5 checkpoint files with basefilename 'checkpoint.chkpt'
and file extension '.h5' found in recovery directory
'nsns_toy1.2_DDME2BPS_quark_1.2vs1.6M_40km_g25'
I suspect you did a "sim submit" for a job, got a failure, and did a
second "sim submit" without purging. That immediately triggered the
error. Then, for some reason, MPI didn't shut down cleanly and the
processes hung doing nothing until they used up the walltime.
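
If it helps, the check that fails here can be reproduced by hand before
resubmitting. The following is a minimal Python sketch, not part of Cactus
or simfactory: the function name find_checkpoints is hypothetical, and I am
assuming that checkpoint files start with the base filename and end with the
extension quoted in the error message.

```python
from pathlib import Path


def find_checkpoints(recover_dir, basename="checkpoint.chkpt", ext=".h5"):
    """Return a sorted list of files in recover_dir that look like checkpoints.

    Assumption: a checkpoint file name starts with `basename` and ends
    with `ext`, as the error message suggests.
    """
    directory = Path(recover_dir)
    if not directory.is_dir():
        # A missing recovery directory means there is nothing to recover from.
        return []
    return sorted(
        p for p in directory.iterdir()
        if p.is_file() and p.name.startswith(basename) and p.name.endswith(ext)
    )


if __name__ == "__main__":
    # Directory name taken from the error message; adjust the path as needed.
    recover_dir = "nsns_toy1.2_DDME2BPS_quark_1.2vs1.6M_40km_g25"
    files = find_checkpoints(recover_dir)
    if files:
        print(f"Found {len(files)} checkpoint file(s); recovery can proceed.")
    else:
        print("No checkpoint files found; recovery would fail.")
```

If the list comes back empty for a simulation that is about to be
resubmitted, purging it and submitting fresh should avoid the error above.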
--Steve
On 4/2/2023 5:16 AM, Shamim Haque 1910511 wrote:
Hello,
I am trying to run a binary neutron star merger (BNSM) simulation using
IllinoisGRMHD on HPC Kanad at IISER Bhopal. I have confirmed that the
parfile runs fine on the debug queue (1 node) and the high-memory queue
(3 nodes), but I am unable to run the simulation in a queue with 9 nodes
(144 cores).
The output file suggests that the setup of the listed thorns does not
complete within 24 hours, which is the maximum walltime for this queue.
Is there a way to sort out this issue? I have attached the parfile and
outfile for reference.
Regards
Shamim Haque
Senior Research Fellow (SRF)
Department of Physics
IISER Bhopal
_______________________________________________
Users mailing list
[email protected]
http://lists.einsteintoolkit.org/mailman/listinfo/users