Dear Mina,

the problems that you describe have different origins.

The first one is clearly related to the GPU implementation. If possible, please share QE's input and output files in an issue on GitLab (https://gitlab.com/QEF/q-e-gpu/-/issues) so that we can investigate further.

The second problem is instead related to I/O, and it is hard to tell whether it comes from the code or from a failure of the parallel filesystem. By the way, I have experienced random I/O problems on Marconi100 as well.

Best regards,
Pietro

On 8/4/20 11:42 AM, Mina Taleblou wrote:
Dear all,

I am running a genetic algorithm from ASE (Atomic Simulation Environment) on Marconi100, using Quantum ESPRESSO as the calculator. The code (main.py) and the calculator file (local_calc.py) are attached.

'main.py' submits 10 jobs in parallel, and jobs stop at random with this error:

pw.x: cudahook.cc:649: CUresult device_free_callback(CUdeviceptr): Assertion `cacheNode != __null' failed.

Other errors also occur at random, such as:

FIO-F-204/CLOSE/unit=4/illegal use of a read-only file.

I would appreciate your help.

Mina Taleblou
Department of Nanotechnology
University of Trieste

_______________________________________________
Quantum ESPRESSO is supported by MaX (http://www.max-centre.eu/quantum-espresso)
users mailing list [email protected]
https://lists.quantum-espresso.org/mailman/listinfo/users
