Thank you for your help.

----------------------------------------------------------------------------------------------------------------
Joshua D. Davis
[email protected]
Cell: (734) 707-1790
Graduate Assistant
Department of Chemistry
Michigan State University
578 S. Shaw Lane, Room 432
East Lansing, MI 48824
-----------------------------------------------------------------------------------------------------------------

On Mon, Feb 15, 2016 at 12:37 PM, Paolo Giannozzi <[email protected]> wrote:

> The check that "max_seconds" have elapsed is done at the end of each
> single diagonalization, so if the latter takes "many_seconds", the check
> may be triggered in the worst case when "max_seconds + some_seconds" have
> elapsed. Since it may take "some_more_seconds" to write data to disk, if
> you are out of luck, "max_seconds + some_seconds + some_more_seconds" will
> exceed the maximum time allowed by the batch queue (or, more exactly, the
> time after which the batch queue realizes that you are out of time: in your
> run, 86427 s, or 27 s more than the wall-time limit, 86400).
>
> Unfortunately, there is no way you can recover your data. And no, there is
> no reliable way to ask the operating system "how much time do I have?"
> before starting a new diagonalization...
>
> Paolo
>
> On Mon, Feb 15, 2016 at 6:11 PM, Joshua Davis <[email protected]>
> wrote:
>
>> Continued... (sent before I meant to)
>>
>> I did try to use disk_io = "high", but I ran into "davcio (10)" read and
>> write errors, so I just used the default "low" option. There were wfc
>> files written in my outdir too.
>>
>> Below are most of the control options I used:
>>
>> &CONTROL
>>     title = 'MgB5C2PP_NORMCON_HSE_ec140_5kp_115bnd_1Q',
>>     calculation = 'scf',
>>     pseudo_dir = './pot',
>>     outdir = './scratch',
>>     prefix = 'MgB5CPP_NC_PBE_ec140_5kp_115bnd',
>>     etot_conv_thr = 1.0D-5,
>>     forc_conv_thr = 1.0D-4,
>>     verbosity = 'high',
>>     wf_collect = .true.,
>>     max_seconds = 84600
>> /
>>
>> &SYSTEM
>>     ibrav = 0,
>>     nat = 52,
>>     ntyp = 3,
>>     ecutwfc = 140,
>>     nspin = 1,
>>     occupations = 'fixed',
>>     nbnd = 115,
>>     input_dft = 'hse',
>>     screening_parameter = 0.106,
>>     nqx1 = 1, nqx2 = 1, nqx3 = 1
>> /
>>
>> &ELECTRONS
>>     mixing_beta = 0.7,
>>     conv_thr = 1.D-8,
>>     electron_maxstep = 200
>> /
>>
>> ATOMIC_SPECIES
>>  Mg  24.305   Mg.pbe-hgh.UPF
>>  B   10.81    B.pbe-hgh.UPF
>>  C   12.011   C.pbe-hgh.UPF
>>
>> K_POINTS (automatic)
>>  5 5 5 0 0 0
>>
>> The calculation ended with:
>>
>>     100 total processes killed (some possibly by mpirun during cleanup)
>>
>> in the out file, and the following was in the scheduler output file:
>>
>>     mpirun: killing job...
>>
>>     --------------------------------------------------------------------------
>>     mpirun noticed that process rank 0 with PID 26679 on node scw-003
>>     exited on signal 0 (Unknown signal 0).
>>     --------------------------------------------------------------------------
>>     =>> PBS: job killed: walltime 86427 exceeded limit 86400
>>     mpirun: abort is already in progress... hit ctrl-c again to forcibly
>>     terminate
>>
>> Other info: the system runs CentOS 6.6, and I am running QE 5.3 compiled
>> with ifort 13.01.
>>
>> Any help would be much appreciated.
>>
>> ----------------------------------------------------------------------------------------------------------------
>> Joshua D. Davis
>>
>> Graduate Assistant
>> Department of Chemistry
>> Michigan State University
>>
>> -----------------------------------------------------------------------------------------------------------------
>>
>> On Mon, Feb 15, 2016 at 11:55 AM, Joshua Davis <
>> [email protected]> wrote:
>>
>>> Dear pwscf forum,
>>>
>>> I am currently trying to run an HSE calculation on my university's
>>> high-performance cluster. To make sure the density and wave functions
>>> are written properly before the scheduled session ends, I usually use
>>> max_seconds to stop the calculation. The max_seconds option did stop the
>>> calculation, but the job was then killed by the scheduler. Can I still
>>> use the wave-function files even though the calculation did not end
>>> cleanly?
>>>
>>> disk_io is set to the default "low". I did try to use
>>> disk_io = "high", but I ran into "davcio (10)"
>>>
>>> ----------------------------------------------------------------------------------------------------------------
>>> Joshua D. Davis
>>>
>>> Graduate Assistant
>>> Michigan State University
>>>
>>> -----------------------------------------------------------------------------------------------------------------
>>
>> _______________________________________________
>> Pw_forum mailing list
>> [email protected]
>> http://pwscf.org/mailman/listinfo/pw_forum
>
> --
> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222
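The failure mode Paolo describes (max_seconds checked only at the end of each diagonalization, then extra time spent writing restart data) suggests sizing the margin between max_seconds and the batch walltime to exceed one full step plus the write time. A minimal shell sketch of that arithmetic; the 7200 s margin and the variable names are illustrative assumptions, not part of the thread:

```shell
#!/bin/sh
# Sketch: derive max_seconds from the requested walltime, leaving a margin
# larger than the worst-case single diagonalization plus restart-file I/O.
# WALLTIME matches the PBS limit from the thread; MARGIN is an assumed
# value that must be tuned to the actual per-step cost of the HSE run.
WALLTIME=86400                       # PBS wall-time limit, in seconds
MARGIN=7200                          # assumed worst-case step + I/O time
MAX_SECONDS=$((WALLTIME - MARGIN))   # value to place in &CONTROL
echo "max_seconds = ${MAX_SECONDS}"
```

In the run above, max_seconds = 84600 left only an 1800 s margin below the 86400 s walltime, which one HSE diagonalization plus the final write evidently exceeded.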
