Hi everyone,

I have recently been trying out the new version of QE (4.1.1) with different 
MPI libraries (mpich, openmpi).  I came across a strange problem where a QE run 
for a nickel test case (large k-mesh, 48x48x48) crashes when I run it in 
parallel across 5 nodes (10 processors) with openmpi-1.3.3 and the Intel v11 
compilers/MKL 10.2.  It appears that the hard drive on the last node becomes 
read-only, so QE can no longer write to its local wavefunction files.  

After the run, the local scratch drive remains read-only, and I end up having 
to reboot the system to clear the problem.  I have been able to reproduce the 
problem repeatedly on the same node.  However, when I remove this node from the 
list, QE runs fine.  Also, running QE with mpich2 on that node does not trigger 
the problem.
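
For anyone who wants to check their own nodes for the same symptom, here is a 
minimal sketch in Python.  It assumes a Linux node where /proc/mounts is 
available, and it only reports whether the kernel currently lists the 
filesystem holding a given directory as read-only ("ro").  The function names 
and the /scratch path in the usage note are hypothetical and not part of QE or 
any MPI library.

```python
import os


def mount_options(path, mounts_text):
    """Return (mount_point, options) for the filesystem containing `path`.

    `mounts_text` is text in /proc/mounts format:
    device mount_point fstype options dump pass
    Returns None if no mount point matches.
    """
    path = os.path.abspath(path)
    best = None
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip malformed or empty lines
        mount_point = fields[1]
        options = fields[3].split(",")
        prefix = mount_point.rstrip("/") + "/"
        if path == mount_point or path.startswith(prefix):
            # Prefer the longest matching mount point; on a tie, the later
            # entry wins because it was mounted on top of the earlier one.
            if best is None or len(mount_point) >= len(best[0]):
                best = (mount_point, options)
    return best


def is_read_only(path, mounts_text):
    """True if the filesystem containing `path` is listed with the 'ro' option."""
    found = mount_options(path, mounts_text)
    return found is not None and "ro" in found[1]


def check_node(scratch_dir, mounts_file="/proc/mounts"):
    """Read the mounts file on this node and report on `scratch_dir`."""
    with open(mounts_file) as handle:
        return is_read_only(scratch_dir, handle.read())
```

Running check_node("/scratch") on each node before and after a job (with 
/scratch replaced by the actual scratch directory) would show whether the 
filesystem flipped to read-only during the run.  It does not say why; the 
kernel log on that node is the place to look for disk I/O errors.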

I suspect that it could be a hardware issue (a hard drive close to dying, 
perhaps) or an issue with openmpi, but I wanted to check whether anyone else 
has run into this problem while using QE.

For additional technical info, this is on a system with Red Hat Enterprise 
Linux 4, 2 Xeon processors (3 GHz), and 2 GB of RAM.

Thanks,

Derek

 

################################
Derek Stewart, Ph. D.
Scientific Computation Associate
** New Webpage **
http://sites.google.com/site/dft4nano/
250 Duffield Hall
Cornell Nanoscale Facility (CNF)
Ithaca, NY 14853
stewart (at) cnf.cornell.edu
(607) 255-2856

