Dear Seth,

Using a combination of dual Xeon, Red Hat 7.3, ifc 7.0 and ATLAS, I think I might have reproduced a similar race condition.

It went away when I used the mkl_ia32 blas library by editing CONFIG:

BLASLIB_p4="-L/opt/intel/mkl61/lib/32 -lmkl_ia32 -Wl,-rpath,/opt/intel/mkl61/lib/32 "
LAPACKLIB_p4="-L/opt/intel/mkl61/lib/32 -lmkl_lapack -Wl,-rpath,/opt/intel/mkl61/lib/32 "


(you might also need -lguide if you are not using -openmp) and then running "make".

It also went away when I used the BLAS shipped with Molpro, by editing the CONFIG file thus:

FTCFLAGS="mpp eaf blas0"
BLASLIB=""
LAPACKLIB=""
BLASLIB_p4=""
LAPACKLIB_p4=""

Then doing:

rm lib/libmolpro.a
make
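
For convenience, the second variant could be scripted roughly as follows. This is only a sketch: the helper names set_config_var and use_bundled_blas are made up for illustration, and it assumes CONFIG holds plain VAR="value" lines like the fragments above and that FTCFLAGS should be replaced wholesale.

```shell
#!/bin/sh
# Sketch only: point a Molpro CONFIG file at the bundled BLAS/LAPACK.
# Assumption: CONFIG consists of plain VAR="value" assignment lines.

# set_config_var FILE NAME VALUE (hypothetical helper)
# Rewrites the NAME="..." line in FILE, or appends one if it is missing.
set_config_var() {
    file=$1 name=$2 value=$3
    if grep -q "^${name}=" "$file"; then
        # '|' is the sed delimiter because library paths contain '/'.
        sed "s|^${name}=.*|${name}=\"${value}\"|" "$file" > "$file.tmp" &&
            mv "$file.tmp" "$file"
    else
        printf '%s="%s"\n' "$name" "$value" >> "$file"
    fi
}

# use_bundled_blas FILE (hypothetical helper)
# Keeps a backup in FILE.orig, then applies the settings quoted above.
use_bundled_blas() {
    config=$1
    cp "$config" "$config.orig"
    set_config_var "$config" FTCFLAGS "mpp eaf blas0"
    for v in BLASLIB LAPACKLIB BLASLIB_p4 LAPACKLIB_p4; do
        set_config_var "$config" "$v" ""
    done
}

# Typical use from the top of the source tree (not run automatically):
#   use_bundled_blas CONFIG && rm -f lib/libmolpro.a && make
```

The backup in CONFIG.orig makes it easy to restore the previous library settings afterwards.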


Can you test whether either Intel's or Molpro's BLAS/LAPACK fixes your problem?
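
Since the failure shows up as a hang rather than a crash, it may help to run the test under a time limit. A minimal sketch, assuming GNU coreutils "timeout" is available; run_with_limit is a made-up helper name, and a process that is truly stuck in the kernel may survive even the final SIGKILL:

```shell
#!/bin/sh
# run_with_limit SECONDS COMMAND [ARGS...] (hypothetical helper)
# Sends SIGTERM after SECONDS, then SIGKILL 10 seconds later if the
# command is still alive, and reports which of the three cases occurred.
run_with_limit() {
    limit=$1
    shift
    timeout -k 10 "$limit" "$@"
    status=$?
    case $status in
        124|137) echo "HUNG (stopped after ${limit}s)" ;;   # timed out
        0)       echo "OK" ;;
        *)       echo "FAILED (exit status $status)" ;;
    esac
    return "$status"
}

# Assumed invocation, adjust to however the test job is normally started:
#   run_with_limit 600 molpro bccd_opt.test
```
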
Best wishes,
Nick



Dr Seth OLSEN wrote:
Hello Molpro-Users,

As outlined in previous communiques, I have been having no luck in getting 
molpro2002.6 to run on a Dual Xeon node with Fedora Core 2, either as the 
installed rpm or as a self-compiled version done with ifc7 or ifc8.  The 
problem is as follows.  After the integral sort, the process writes no more to 
output but becomes unkillable with 99.9%CPU and 1.0%Mem as given by 'top'.
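
The kernel's view of the stuck process can be read from /proc as a complement to 'top'. The sketch below is Linux-specific and proc_state is a made-up helper name. A state of R with near-100% CPU suggests the process is spinning, while D (uninterruptible sleep) is the usual reason a process ignores signals.

```shell
#!/bin/sh
# proc_state PID (hypothetical helper, Linux only)
# Prints the one-letter state from /proc/PID/status:
#   R running, S sleeping, D uninterruptible sleep, T stopped, Z zombie.
proc_state() {
    awk '/^State:/ { print $2 }' "/proc/$1/status"
}

# Example with the PID from the lsof listing (not run automatically):
#   proc_state 2627
```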

In order to help diagnose the problem, I have turned the 'gprint,io,cpu' 
directive on in a given failing job (bccd_opt.test).  The following are the 
last lines written to output for that job with the io printing turned on:

EXTENDING RECORD 1300.1 BY 34949. WORDS TO 38820. IMPLEMENTATION=df EXTENSION 0
NUMBER OF SORTED TWO-ELECTRON INTEGRALS: 34949. BUFFER LENGTH: 32768
NUMBER OF SEGMENTS: 1 SEGMENT LENGTH: 34949 RECORD LENGTH: 524288
Memory used in sort: 0.59 MW
OPENW FILE 24 NAME=/scratch/root/eaf_T2400002627.TMP IMPLEMENTATION=eaf STATUS=scratch HANDLE= 2
OPEN EAF FILE 24 NAME= IMPLEMENTATION=eaf
CLOSEW FILE 21 NAME=eaf_T2100002627.TMP IMPLEMENTATION=eaf HANDLE= 1
CLOSE EAF FILE 21


To determine what files might be opened by molpro at the time that the program 
stops functioning, I issue a 'lsof | grep molpro' command while the program is 
running in its 'unkillable' final status.  The following is the output of that 
command:

bash      2210     root  cwd    DIR        8,1    12288     295282 /opt/molpro/testjobs
molpro    2624     root  cwd    DIR        8,2     4096    2796193 /scratch/root
molpro    2624     root  rtd    DIR        8,1     4096          2 /
molpro    2624     root  txt    REG        8,1    41552     491923 /usr/local/lib/molpro-mpp-Linux-i686-i4-2002.6/molpro
molpro    2624     root  mem    REG        8,1  1455084      82119 /lib/tls/libc-2.3.3.so
molpro    2624     root  mem    REG        8,1   106892     375519 /lib/ld-2.3.3.so
molpro    2624     root    0u   CHR      136,1                   3 /dev/pts/1
molpro    2624     root    1u   CHR      136,1                   3 /dev/pts/1
molpro    2624     root    2u   CHR      136,1                   3 /dev/pts/1
molpro    2624     root    4u   REG        8,1      823      18013 /tmp/tmpfuX2LXr (deleted)
parallel  2626     root  txt    REG        8,1    30180     491926 /usr/local/lib/molpro-mpp-Linux-i686-i4-2002.6/parallel
parallel  2626     root    1u   REG        8,1    12113     295235 /opt/molpro/testjobs/bccd_opt.out
molprop_2 2627     root  cwd    DIR        8,2     4096    2796193 /scratch/root
molprop_2 2627     root  rtd    DIR        8,1     4096          2 /
molprop_2 2627     root  txt    REG        8,1 19346064     491925 /usr/local/lib/molpro-mpp-Linux-i686-i4-2002.6/molprop_2002_6_p4_tcgmsg.exe
molprop_2 2627     root  mem    REG        8,1    96248     375542 /lib/libnsl-2.3.3.so
molprop_2 2627     root  mem    REG        8,1   106892     375519 /lib/ld-2.3.3.so
molprop_2 2627     root  mem    REG        8,1  1455084      82119 /lib/tls/libc-2.3.3.so
molprop_2 2627     root  mem    REG        8,1   214796      82121 /lib/tls/libm-2.3.3.so
molprop_2 2627     root  mem    REG        8,1    43528     375552 /lib/libnss_nis-2.3.3.so
molprop_2 2627     root  mem    REG        8,1    50944     375549 /lib/libnss_files-2.3.3.so
molprop_2 2627     root    0u   REG        8,1      823      18013 /tmp/tmpfuX2LXr (deleted)
molprop_2 2627     root    1u   REG        8,1    12113     295235 /opt/molpro/testjobs/bccd_opt.out
molprop_2 2627     root    2u   CHR      136,1                   3 /dev/pts/1
molprop_2 2627     root    3u  IPv4       4464                 TCP sphinx128.giza:32846->sphinx128.giza:32844 (ESTABLISHED)
molprop_2 2627     root    4u   REG        8,1     1457      18014 /tmp/forttempG1uyhO
molprop_2 2627     root    5u   REG        8,1       74      18015 /tmp/forttempfYdJb0
molprop_2 2627     root    6u   REG        8,1        0      18016 /tmp/forttemp2hFU5b
molprop_2 2627     root    7u   REG        8,1        0      18017 /tmp/forttemp9p96Zn
molprop_2 2627     root    8u   REG        8,2  3006888    2796194 /scratch/root/df_T0100002627.TMP (deleted)
molprop_2 2627     root    9u   REG        8,2   182344    2796195 /scratch/root/df_T0200002627.TMP (deleted)
molprop_2 2627     root   10u   REG        8,2   182344    2796196 /scratch/root/df_T0300002627.TMP (deleted)
molprop_2 2627     root   11u   REG        8,2        0    2796197 /scratch/root/df_T0400002627.TMP (deleted)
molprop_2 2627     root   12r   REG        8,1   476967     491914 /usr/local/lib/molpro-mpp-Linux-i686-i4-2002.6/libmol.index
molprop_2 2627     root   13u   REG        8,2  3428352    2796199 /scratch/root/eaf_T2400002627.TMP (deleted)

So, it appears that the *.TMP files that molpro has most recently opened and 
closed are listed as deleted but still open.  I cannot find these files in the 
specified directory, which makes sense if they are deleted, but if they are 
deleted then how can they be currently open files?
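
On Unix-like systems this combination is ordinary behaviour: removing a file only unlinks its directory entry, and the data persists until the last open descriptor is closed, which is why lsof marks such files "(deleted)". Programs often unlink scratch files right after opening them so that they vanish automatically on exit. A minimal shell sketch of the effect, using a throwaway temporary file rather than Molpro's scratch files (demo_deleted_but_open is a made-up name):

```shell
#!/bin/sh
# Demonstration: a file removed from its directory stays usable through
# descriptors that were opened before the removal.
demo_deleted_but_open() {
    tmpfile=$(mktemp) || return 1
    exec 3>"$tmpfile" 4<"$tmpfile"     # fd 3 for writing, fd 4 for reading
    rm "$tmpfile"                      # unlink the only directory entry
    [ -e "$tmpfile" ] && return 1      # the name is gone from the directory
    echo "data written after rm" >&3   # yet the open file still accepts writes
    cat <&4                            # and the data can be read back
    exec 3>&- 4<&-                     # closing the descriptors frees the space
}

demo_deleted_but_open
```

While descriptors 3 and 4 are open, lsof would list the temporary file with the "(deleted)" suffix, just as in the listing above.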

Cheers,

Seth Olsen



ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms

Dr Seth Olsen, PhD
Postdoctoral Fellow, Computational Systems Biology Group
Centre for Computational Molecular Science
Chemistry Building,
The University of Queensland
Qld 4072, Brisbane, Australia

tel (617) 33653732
fax (617) 33654623
email: [EMAIL PROTECTED]
Web: www.ccms.uq.edu.au


ccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccmsccms



