Perhaps a stupid reply...

Can you try running ipcs while the code is running, to see which shared memory 
segments and semaphores are in use?
It won't solve your problem, but it might shed some light on what is going on.



-----Original Message-----
From: Beowulf [mailto:[email protected]] On Behalf Of Jörg 
Saßmannshausen
Sent: 04 April 2016 22:29
To: Beowulf Mailinglist <[email protected]>
Subject: [Beowulf] shared memory error

Dear all,

I was wondering whether somebody might be able to shed some light on this 
problem I am having with a chemistry code (GAMESS-US):

DDI Process 15: semop return an error performing 1 operation(s) on semid 98307.
semop errno=EINVAL.

This sometimes happens when I need quite a bit of memory for the Fortran code
(1550000000 words). Originally I thought it had to do with the hardware I am 
running it on, but meanwhile I have seen it all over the place, i.e. on some 
older Opterons and on some newer Ivy Bridge and Haswell CPUs.

Unfortunately, it is not reliably reproducible. A run might work fine for a few 
days before the problem kicks in and the log file explodes from around 14 MB to 
17 GB, or it might just work.

Some system information: I am running Debian Jessie with gcc / gfortran 
version 4.9.2-10. The nodes have 64 GB of RAM and 16 or 20 cores. As the 
default shared memory settings in Linux are not suitable for GAMESS (there is a 
note in the documentation), I am using these settings on the 64 GB RAM
machines:

kernel.shmmax = 6923000000
kernel.shmall = 25165824
kernel.shmmni = 32768
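
For orientation, the three numbers above are in different units, so it can help 
to put them side by side. The following is only a rough sketch: it assumes 
4096-byte pages and 8-byte GAMESS words, neither of which is stated in this 
post, and the names PAGE_SIZE, WORD_SIZE and summarize are made up for the 
illustration. On Linux, kernel.shmmax is a per-segment limit in bytes and 
kernel.shmall is a system-wide limit in pages.

```python
# Sketch: put the sysctl values from the post into comparable units.
# Assumptions (NOT stated in the post): 4096-byte pages, 8-byte words.

PAGE_SIZE = 4096  # bytes per page; assumed, check with `getconf PAGE_SIZE`
WORD_SIZE = 8     # bytes per GAMESS word; assumed
GIB = 1024 ** 3


def to_gib(n_bytes):
    """Convert a byte count to GiB."""
    return n_bytes / GIB


def summarize(shmmax, shmall, words):
    """Return the limits and the request in GiB.

    shmmax: largest single shared memory segment, in bytes
    shmall: total shared memory allowed system-wide, in pages
    words:  memory requested by the job, in words
    """
    return {
        "shmmax_gib": to_gib(shmmax),
        "shmall_gib": to_gib(shmall * PAGE_SIZE),
        "request_gib": to_gib(words * WORD_SIZE),
    }


if __name__ == "__main__":
    # Values copied from the post.
    result = summarize(6923000000, 25165824, 1550000000)
    for name, value in result.items():
        print(f"{name}: {value:.2f}")
```

Under those assumptions, shmall corresponds to 96 GiB (more than the 64 GB of 
RAM in the nodes), shmmax to roughly 6.4 GiB, and 1550000000 words to roughly 
11.5 GiB. Whether that request ends up in a single segment depends on how the 
job splits its memory, which is not known from this post, so the comparison is 
only indicative.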

I have the feeling the problem lies buried in these settings, but my knowledge 
here is not sufficient to solve it. Could somebody point me in the 
right direction here?

All the best from London

Jörg

--
*************************************************************
Dr. Jörg Saßmannshausen, MRSC
University College London
Department of Chemistry
20 Gordon Street
London
WC1H 0AJ

email: [email protected]
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
