Perhaps a stupid reply... Can you try running ipcs while the code is running, to see which shared memory segments and semaphores are in use? It won't solve your problem, but it might shed some light on what is going on.
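To make that a bit more concrete, here is a rough Python sketch of the kind of check I have in mind. It is not anything GAMESS or DDI provides, and all function names are made up for illustration. According to the semop(2) man page, EINVAL is returned when the semaphore set does not exist (or the semid or operation count is invalid), so it seems worth checking whether semid 98307 is still listed by ipcs -s at the moment the errors start. The sketch also puts the sysctl values quoted below into bytes, since kernel.shmmax is given in bytes while kernel.shmall is counted in pages. Assumptions: one GAMESS word is 8 bytes, the page size is 4096 bytes (check with getconf PAGE_SIZE), and ipcs -s prints the semid in the second column as util-linux does.

```python
import subprocess

WORD_BYTES = 8     # assumption: one GAMESS word is 8 bytes
PAGE_BYTES = 4096  # assumption: 4 KiB pages; verify with `getconf PAGE_SIZE`


def request_bytes(words, word_bytes=WORD_BYTES):
    """Convert a memory request given in words to bytes."""
    return words * word_bytes


def limits_in_bytes(shmmax, shmall_pages, page_bytes=PAGE_BYTES):
    """kernel.shmmax is already in bytes; kernel.shmall is counted in pages."""
    return {"max_segment": shmmax, "total_shared": shmall_pages * page_bytes}


def semid_present(ipcs_output, semid):
    """Return True if semid appears in the second column of `ipcs -s` output.

    Assumes the util-linux layout: key, semid, owner, perms, nsems.
    """
    for line in ipcs_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[1] == str(semid):
            return True
    return False


def snapshot_semaphores():
    """Run `ipcs -s` once and return its text output (requires ipcs on PATH)."""
    result = subprocess.run(
        ["ipcs", "-s"], capture_output=True, text=True, check=True
    )
    return result.stdout


if __name__ == "__main__":
    # Values taken from the quoted message; only the arithmetic is done here.
    limits = limits_in_bytes(shmmax=6923000000, shmall_pages=25165824)
    wanted = request_bytes(1550000000)
    print("requested bytes:      ", wanted)
    print("largest segment bytes:", limits["max_segment"])
    print("total shared bytes:   ", limits["total_shared"])
```

On a node where the job is running, something like semid_present(snapshot_semaphores(), 98307) called every minute or so from a small loop would show whether the semaphore set disappears before the log starts growing. On the arithmetic: 1550000000 words at 8 bytes each is 12.4 GB, which is larger than the 6.9 GB allowed for a single segment by kernel.shmmax, while kernel.shmall works out to 96 GiB, more than the 64 GB of RAM in the nodes. Whether that request is actually served from a single System V segment depends on how GAMESS/DDI allocates it, which I don't know, so treat this as something to check rather than a diagnosis.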
-----Original Message-----
From: Beowulf [mailto:[email protected]] On Behalf Of Jörg Saßmannshausen
Sent: 04 April 2016 22:29
To: Beowulf Mailinglist <[email protected]>
Subject: [Beowulf] shared memory error

Dear all,

I was wondering whether somebody might be able to shed some light on this problem I am having with a chemistry code (GAMESS-US):

DDI Process 15: semop return an error performing 1 operation(s) on semid 98307.
semop errno=EINVAL.

This sometimes happens when I need quite a bit of memory for the Fortran code (1550000000 words). Originally I thought it had to do with the hardware I am running it on, but meanwhile I have found it all over the place, i.e. on some older Opterons and on some newer Ivy Bridge and Haswell CPUs.

It is not quite reproducible, unfortunately. A run might work OK for a few days and then the problem kicks in and the logfile explodes from around 14 MB to 17 GB, or it might just work.

Some system information: I am running Debian Jessie with gcc / gfortran version 4.9.2-10. The nodes have 64 GB of RAM and 16 or 20 cores.

As the shared memory default settings in Linux are not suitable for GAMESS (there is a note in the documentation), I am using these settings on the 64 GB RAM machines:

kernel.shmmax = 6923000000
kernel.shmall = 25165824
kernel.shmmni = 32768

I have the feeling the problem lies buried in these settings, but my knowledge here is not sufficient to solve it. Could somebody point me in the right direction?

All the best from London

Jörg

--
*************************************************************
Dr. Jörg Saßmannshausen, MRSC
University College London
Department of Chemistry
20 Gordon Street
London
WC1H 0AJ

email: [email protected]
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html

_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit http://www.beowulf.org/mailman/listinfo/beowulf
