I see this error message in your output:

  -> No HDF5 checkpoint files with basefilename 'checkpoint.chkpt' and file extension '.h5' found in recovery directory 'nsns_toy1.2_DDME2BPS_quark_1.2vs1.6M_40km_g25'

I suspect you did a "sim submit" for a job, got a failure, and then did a second "sim submit" without purging the simulation first. That would have triggered the error immediately. Then, for some reason, MPI didn't shut down cleanly, and the processes hung idle until they used up the walltime.
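
If it helps, here is a minimal sketch of how you could check whether the recovery directory named in the message actually holds any checkpoint files before resubmitting. The function name find_checkpoints and the matching rule (file name starts with the base filename and ends with the extension) are my assumptions for illustration, not the toolkit's actual recovery logic.

```python
from pathlib import Path


def find_checkpoints(recover_dir, basename="checkpoint.chkpt", ext=".h5"):
    """Return the sorted checkpoint files found directly in recover_dir.

    Hypothetical helper: a file counts as a checkpoint if its name starts
    with `basename` and ends with `ext` (an assumed matching rule).
    """
    directory = Path(recover_dir)
    if not directory.is_dir():
        # A missing directory simply means there is nothing to recover from.
        return []
    return sorted(
        p
        for p in directory.iterdir()
        if p.is_file() and p.name.startswith(basename) and p.name.endswith(ext)
    )


if __name__ == "__main__":
    # Directory name copied from the error message above.
    found = find_checkpoints("nsns_toy1.2_DDME2BPS_quark_1.2vs1.6M_40km_g25")
    if found:
        for path in found:
            print(path)
    else:
        print("no checkpoint files found; the run would have to start from scratch")
```

An empty result would be consistent with the message above: the earlier run failed before it wrote a checkpoint.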

--Steve

On 4/2/2023 5:16 AM, Shamim Haque 1910511 wrote:
Hello,

I am trying to run a BNSM simulation using IllinoisGRMHD on HPC Kanad at IISER Bhopal. I have confirmed that the parfile runs fine on the debug queue (1 node) and the high-memory queue (3 nodes), but I am unable to run the simulation in a queue with 9 nodes (144 cores).

The output file suggests that the setup of the listed thorns does not complete within 24 hours, which is the maximum walltime for this queue.

Is there a way to sort out this issue? I have attached the parfile and outfile for reference.

Regards
Shamim Haque
Senior Research Fellow (SRF)
Department of Physics
IISER Bhopal

_______________________________________________
Users mailing list
[email protected]
http://lists.einsteintoolkit.org/mailman/listinfo/users