Hi,

I am running Molpro in parallel for the very first time, on an IBM Regatta (p690) system with a Colony switch.

I have used Global Arrays 3.3.1 (GA, TARGET=LAPI64), and it seems to run fine.

However, I now want to get as much as possible out of it, and many points are still unclear to me.

The typical target calculations of our users are CCSD(T) optimizations.


When running one of them, I would like to use memory as much as possible for the integrals, and fall back to disk only when memory is not enough. The disk is a shared GPFS filesystem: there is no benefit in concurrent access to it, and it has become the bottleneck for the code.


With typical options I have found that the tasks do not use much memory, neither standard memory nor GA memory.

Running on 16 CPUs, at the very beginning of the output I found:


********** ARMCI configured for 2 cluster nodes

MPP nodes nproc
sp154 8
sp152 8
ga_uses_ma=false, calling ma_init with nominal heap. Any -G option will be ignored.


 Primary working directories:    /scratch_ssa/abc0
 Secondary working directories:  /scratch_ssa/abc0

 blaslib=default

MPP tuning parameters: Latency= 84 Microseconds, Broadcast speed= 233 MB/sec
default implementation of scratch files=ga
**********


Only if I use one task (or maybe one node) do I find ga_uses_ma=true.
On the other hand, the statement "default implementation of scratch files=ga" would lead me to think that the scratch files are "in-memory files". However, what happens at run time does not match this:


In fact, I observe a lot of I/O, and the memory used is about 200 MB (of 2 GB) for each task.

After the CCSD I get:
 DISK USED  *         9.10 GB
 GA USED    *       120.58 MB (max)        .00 MB (current)

And indeed, at the beginning I set:
 memory,200,M

(This is not the GA memory, but the -G option is ignored, and I do not understand why.)
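
As a rough check of the figures quoted above, the aggregate memory of the run can be compared with the disk usage. This is only a back-of-the-envelope sketch in Python: it assumes that the 2 GB is really available to each of the 16 tasks and that 1 GB = 1024 MB, and the variable and function names are mine, not anything from Molpro or GA.

```python
# Figures taken from this message; the per-task limit of 2 GB and the
# conversion 1 GB = 1024 MB are assumptions made for this estimate.
TASKS = 16
MEM_PER_TASK_MB = 2 * 1024   # "of 2 GB" for each task
USED_PER_TASK_MB = 200       # memory observed in use for each task
DISK_USED_GB = 9.10          # "DISK USED" reported after the CCSD


def aggregate_gb(per_task_mb, tasks=TASKS):
    """Sum a per-task amount in MB over all tasks and return it in GB."""
    return per_task_mb * tasks / 1024


total_gb = aggregate_gb(MEM_PER_TASK_MB)
used_gb = aggregate_gb(USED_PER_TASK_MB)
free_gb = total_gb - used_gb

print(f"aggregate memory: {total_gb:.2f} GB")
print(f"memory in use:    {used_gb:.2f} GB")
print(f"memory unused:    {free_gb:.2f} GB")
print(f"scratch data fits in unused memory: {DISK_USED_GB < free_gb}")
```

If these assumptions hold, the scratch data written to GPFS would in principle fit in the memory that the tasks leave unused, which is what I would like to exploit.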


Can any of you explain some of these facts and give some suggestions for parallel runs?


For instance, I also tried direct calculations, but:
1. they were very slow
2. they terminate with the error:

******
FILE 5 RECORD    1380 OFFSET=          0. NOT FOUND

 Records on file 5

IREC   NAME  TYPE      OFFSET     LENGTH  IMPLEMENTATION  EXT  PREV  PARENT  MPP_STATE
   1   4000             4096.     21301.  df                0     0       0          1
   2   4001            25397.    166404.  df                0     0       0          1
   3   4002           191801.     10725.  df                0     0       0          0
   4   4003           202526.    178782.  df                0     0       0          1
   5  35020           381308.     10496.  df                0     0       0          1
   6   3600           391804.       273.  df                0     0       0          1
   7   3601           392077.       273.  df                0     0       0          1
   8  35000           392350.        10.  df                0     0       0          1
   9  35001           392360.        10.  df                0     0       0          1
  10  35010           392370.       320.  df                0     0       0          1
  11  35011           392690.       320.  df                0     0       0          1
  12   7005           393010.    314964.  df                0     0       0          1
  13   8005           707974.    314964.  df                0     0       0          1
  14   9101          1022938.   9567696.  df                0     0       0          0
  15   9103         10590634.   9567696.  df                0     0       0          0


 ? Error
 ? Record not found
 ? The problem occurs in readm

 ERROR EXIT
 CURRENT STACK:      CIPRO  MAIN
*******

Many thanks for any help.

  Regards


Sigismondo Boschi




--
Sigismondo Boschi, Ph.D.              tel: +39 051 6171559
CINECA (High Performance Systems)     fax: +39 051 6137273 - 6132198
via Magnanelli, 6/3                   http://instm.cineca.it
40033 Casalecchio di Reno (BO)-ITALY  http://www.cineca.it


