Thank you so much, Paolo. I will report your feedback to the IT staff in charge of maintaining the cluster I use and see if they have any suggestions.
All the best,
Martina

On Thu, Aug 30, 2018 at 1:20 PM Paolo Giannozzi <[email protected]> wrote:

> In my opinion, if everything works on less than 16 processors and nothing
> on more than 16, there is something wrong with your MPI environment
>
> On Thu, Aug 30, 2018 at 6:49 PM, Martina Lessio <[email protected]>
> wrote:
>
>> Dear Paolo,
>>
>> Thanks for testing my input file. I guess this means there is something
>> wrong with how I compiled the code, although it's hard to understand why
>> the error only occurs when I submit my job on more than 16 processors.
>>
>> Thanks again for your time.
>>
>> All the best,
>> Martina
>>
>> On Thu, Aug 30, 2018 at 12:43 PM Paolo Giannozzi <[email protected]>
>> wrote:
>>
>>> It works for me, on 18, 24, 32 processors, at least for the development
>>> version. I ran it on a 16-processor machine, but it doesn't matter how many
>>> physical cores one has (the code knows nothing about the actual number of
>>> cores, only about the number of MPI processes)
>>>
>>> Paolo
>>>
>>> On Thu, Aug 30, 2018 at 6:23 PM, Martina Lessio <[email protected]>
>>> wrote:
>>>
>>>> Dear Paolo,
>>>>
>>>> Apologies for not including those details. The sample error message
>>>> reported in my previous email was the result of a calculation run on 18 cpu
>>>> (but I get similar messages when running on other cpu numbers larger than
>>>> 16) using the following submission command:
>>>> mpirun -np 18 pw.x < MoTe2_opt.in
>>>>
>>>> Thank you in advance for your help.
>>>>
>>>> All the best,
>>>> Martina
>>>>
>>>> On Thu, Aug 30, 2018 at 12:09 PM Paolo Giannozzi <[email protected]>
>>>> wrote:
>>>>
>>>>> Please report the exact conditions under which you are running the
>>>>> 24-processor case: something like
>>>>> mpirun -np 24 pw.x -nk .. -nd .. -whatever_option
>>>>>
>>>>> Paolo
>>>>>
>>>>> On Wed, Aug 29, 2018 at 11:49 PM, Martina Lessio <[email protected]>
>>>>> wrote:
>>>>>
>>>>>> Dear all,
>>>>>>
>>>>>> I have been successfully using QE 5.4 for a while now but recently
>>>>>> decided to install the newest version hoping that some issues I have been
>>>>>> experiencing with 5.4 would be resolved. However, I now have some issues
>>>>>> when running version 6.3 in parallel. In particular, if I run a sample
>>>>>> calculation (input file provided below) on more than 16 processors the
>>>>>> calculation crashes after printing this line "Starting wfcs are random"
>>>>>> and
>>>>>> the following error message is printed in the output file:
>>>>>> [compute-0-5.local:5241] *** An error occurred in MPI_Bcast
>>>>>> [compute-0-5.local:5241] *** on communicator MPI COMMUNICATOR 20
>>>>>> SPLIT FROM 18
>>>>>> [compute-0-5.local:5241] *** MPI_ERR_TRUNCATE: message truncated
>>>>>> [compute-0-5.local:5241] *** MPI_ERRORS_ARE_FATAL: your MPI job will
>>>>>> now abort
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> mpirun has exited due to process rank 16 with PID 5243 on
>>>>>> node compute-0-5.local exiting improperly. There are two reasons this
>>>>>> could occur:
>>>>>>
>>>>>> 1. this process did not call "init" before exiting, but others in
>>>>>> the job did. This can cause a job to hang indefinitely while it waits
>>>>>> for all processes to call "init". By rule, if one process calls
>>>>>> "init",
>>>>>> then ALL processes must call "init" prior to termination.
>>>>>>
>>>>>> 2. this process called "init", but exited without calling "finalize".
>>>>>> By rule, all processes that call "init" MUST call "finalize" prior to
>>>>>> exiting or it will be considered an "abnormal termination"
>>>>>>
>>>>>> This may have caused other processes in the application to be
>>>>>> terminated by signals sent by mpirun (as reported here).
>>>>>>
>>>>>> --------------------------------------------------------------------------
>>>>>> [compute-0-5.local:05226] 1 more process has sent help message
>>>>>> help-mpi-errors.txt / mpi_errors_are_fatal
>>>>>> [compute-0-5.local:05226] Set MCA parameter
>>>>>> "orte_base_help_aggregate" to 0 to see all help / error messages
>>>>>>
>>>>>> Note that I have been running QE 5.4 on 24 cpu on this same computer
>>>>>> cluster without any issue. I am copying my input file at the end of this
>>>>>> email.
>>>>>>
>>>>>> Any help with this would be greatly appreciated.
>>>>>> Thank you in advance.
>>>>>>
>>>>>> All the best,
>>>>>> Martina
>>>>>>
>>>>>> Martina Lessio
>>>>>> Department of Chemistry
>>>>>> Columbia University
>>>>>>
>>>>>> *Input file:*
>>>>>> &control
>>>>>> calculation = 'relax'
>>>>>> restart_mode='from_scratch',
>>>>>> prefix='MoTe2_bulk_opt_1',
>>>>>> pseudo_dir = '/home/mlessio/espresso-5.4.0/pseudo/',
>>>>>> outdir='/home/mlessio/espresso-5.4.0/tempdir/'
>>>>>> /
>>>>>> &system
>>>>>> ibrav= 4, A=3.530, B=3.530, C=13.882, cosAB=-0.5, cosAC=0,
>>>>>> cosBC=0,
>>>>>> nat= 6, ntyp= 2,
>>>>>> ecutwfc =60.
>>>>>> occupations='smearing', smearing='gaussian', degauss=0.01
>>>>>> nspin =1
>>>>>> /
>>>>>> &electrons
>>>>>> mixing_mode = 'plain'
>>>>>> mixing_beta = 0.7
>>>>>> conv_thr = 1.0d-10
>>>>>> /
>>>>>> &ions
>>>>>> /
>>>>>> ATOMIC_SPECIES
>>>>>> Mo 95.96 Mo_ONCV_PBE_FR-1.0.upf
>>>>>> Te 127.6 Te_ONCV_PBE_FR-1.1.upf
>>>>>> ATOMIC_POSITIONS {crystal}
>>>>>> Te 0.333333334 0.666666643 0.625000034
>>>>>> Te 0.666666641 0.333333282 0.375000000
>>>>>> Te 0.666666641 0.333333282 0.125000000
>>>>>> Te 0.333333334 0.666666643 0.874999966
>>>>>> Mo 0.333333334 0.666666643 0.250000000
>>>>>> Mo 0.666666641 0.333333282 0.750000000
>>>>>>
>>>>>> K_POINTS {automatic}
>>>>>> 8 8 2 0 0 0
>>>>>>
>>>>>> _______________________________________________
>>>>>> users mailing list
>>>>>> [email protected]
>>>>>> https://lists.quantum-espresso.org/mailman/listinfo/users
>>>>>
>>>>> --
>>>>> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
>>>>> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>>>>> Phone +39-0432-558216, fax +39-0432-558222
>>>>
>>>> --
>>>> Martina Lessio, Ph.D.
>>>> Frontiers of Science Lecturer in Discipline
>>>> Postdoctoral Research Scientist
>>>> Department of Chemistry
>>>> Columbia University
>>>
>>> --
>>> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
>>> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
>>> Phone +39-0432-558216, fax +39-0432-558222
>>
>> --
>> Martina Lessio, Ph.D.
>> Frontiers of Science Lecturer in Discipline
>> Postdoctoral Research Scientist
>> Department of Chemistry
>> Columbia University
>
> --
> Paolo Giannozzi, Dip. Scienze Matematiche Informatiche e Fisiche,
> Univ. Udine, via delle Scienze 208, 33100 Udine, Italy
> Phone +39-0432-558216, fax +39-0432-558222

--
Martina Lessio, Ph.D.
Frontiers of Science Lecturer in Discipline
Postdoctoral Research Scientist
Department of Chemistry
Columbia University
_______________________________________________ users mailing list [email protected] https://lists.quantum-espresso.org/mailman/listinfo/users
