Thanks Nick. I have set ulimit to unlimited and I am using openmpi-1.8.4. The 
problem seems to be related to the cluster, so I have asked the cluster 
administrator to help debug it. Thanks for the information.

 

Best,
Xiaoming

 

From: [email protected] [mailto:[email protected]] On Behalf Of Nick Papior
Sent: Saturday, October 31, 2015 2:56 PM
To: [email protected]
Subject: Re: [SIESTA-L] scf loop break

 

There does not seem to be any problem in your arch.make, provided OpenMPI is 
used and was compiled with the Intel compiler.

 

Otherwise, you need to debug on the specific machine where the problem occurs; 
consult your cluster administrator on how best to do this. It could be related 
to some kind of memory leak, so check for that. Have you also set ulimit?

Which OpenMPI version are you using? In my installation, 1.8.7 had a memory 
leak, so avoid that version.
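
For reference, a minimal sketch of how the limits seen by a running process can be inspected, using Python's standard `resource` module (Unix only). It assumes the stack-size limit is the one of interest here; the thread does not say which resource the `ulimit` question refers to, so that choice and the helper names `describe_limit` and `stack_limits` are illustrative only.

```python
import resource


def describe_limit(value):
    """Render a limit value, mapping RLIM_INFINITY to 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def stack_limits():
    """Return the (soft, hard) stack-size limits of this process, in bytes."""
    # Assumption: the stack limit is the relevant one; other limits
    # (e.g. resource.RLIMIT_AS) can be queried the same way.
    return resource.getrlimit(resource.RLIMIT_STACK)


if __name__ == "__main__":
    soft, hard = stack_limits()
    print("stack soft limit:", describe_limit(soft))
    print("stack hard limit:", describe_limit(hard))
```

Limits on compute nodes may differ from those in a login shell, so a check like this is most informative when run inside the batch job itself.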

 

2015-10-31 15:09 GMT+01:00 Xiaoming Wang <[email protected]>:

Dear siesta users and developers,

 

I have a problem: just when the SCF loop seems about to finish, the code 
crashes (see the output below). However, the same input works on another 
computer. Does anyone know what is going on here? I have attached my 
arch.make. Any comments are appreciated.

 

***************************************************************************
siesta:  106  -152559.7801  -152559.7799  -152559.8859  0.0000 -2.4612
siesta:  107  -152559.7801  -152559.7799  -152559.8859  0.0000 -2.4612
siesta:  108  -152559.7801  -152559.7800  -152559.8860  0.0000 -2.4612
siesta:  109  -152559.7801  -152559.7800  -152559.8860  0.0000 -2.4612
siesta:  110  -152559.7801  -152559.7800  -152559.8860  0.0000 -2.4612
siesta:  111  -152559.7801  -152559.7800  -152559.8860  0.0000 -2.4612
siesta:  112  -152559.7801  -152559.7800  -152559.8860  0.0000 -2.4612
siesta:  113  -152559.7801  -152559.7800  -152559.8860  0.0000 -2.4612
siesta:  114  -152559.7801  -152559.7800  -152559.8860  0.0000 -2.4612
--------------------------------------------------------------------------
orterun noticed that process rank 3 with PID 24733 on node comet-23-62 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------
--------------------------------------------------------------------------
WARNING: A process refused to die despite all the efforts!
This process may still be running and/or consuming resources.

Host: comet-23-62
PID:  24759

--------------------------------------------------------------------------
IBRUN: Job ended with value 139
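
The value 139 in the last line is consistent with the segmentation fault reported by orterun: by shell convention, a process killed by signal N is reported with exit status 128 + N, and 139 - 128 = 11. A small sketch of that arithmetic follows. It assumes IBRUN passes the status through in this shell-style form, and `decode_exit_status` is a hypothetical helper name, not part of any of the tools involved.

```python
import signal


def decode_exit_status(status):
    """Interpret a shell-style exit status as a normal exit or a signal."""
    if status > 128:
        # Shell convention: status = 128 + signal number.
        signum = status - 128
        try:
            name = signal.Signals(signum).name
        except ValueError:
            name = "unknown"
        return ("signal", signum, name)
    return ("exit", status, None)


if __name__ == "__main__":
    print(decode_exit_status(139))  # ('signal', 11, 'SIGSEGV')
```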

 

 

Xiaoming Wang

Rutgers University





 

-- 

Kind regards Nick
