[molpro-user] parallel molpro behavior

Sigismondo Boschi Tue, 31 Aug 2004 04:16:02 -0700

Hi,

I am running Molpro in parallel for the first time, on an IBM Regatta (p690) system with a Colony switch.

I have used Global Arrays 3.3.1 (GA, TARGET=LAPI64), and it seems to run fine.

However, I now want to get as much as possible out of it, and many points are still unclear to me.

The typical calculations of our users are CCSD(T) optimizations.

When running one of them, I would like to keep the integrals in memory as far as possible and fall back to disk only when that is not possible. The disk is a shared GPFS filesystem: there is no benefit in concurrent access to it, and it has become the bottleneck for the code.

With typical options I have found that the tasks do not use much memory, neither standard memory nor GA memory.

Running on 16 CPUs, I found this at the very beginning of the output:


**********
ARMCI configured for 2 cluster nodes

 MPP nodes  nproc
 sp154        8
 sp152        8
 ga_uses_ma=false, calling ma_init with nominal heap. Any -G option will be ignored.

 Primary working directories:    /scratch_ssa/abc0
 Secondary working directories:  /scratch_ssa/abc0

 blaslib=default

 MPP tuning parameters: Latency= 84 Microseconds, Broadcast speed= 233 MB/sec
 default implementation of scratch files=ga
**********

I get ga_uses_ma=true only if I use one task (or maybe one node). On the other hand, the statement "default implementation of scratch files=ga" would make me think that the scratch files are "in-memory files"... However, what happens at run time does not match this.

In fact, I observe a lot of I/O, and the memory used is about 200 MB (out of 2 GB) for each task.

After the CCSD I get:
 DISK USED  *         9.10 GB
 GA USED    *       120.58 MB (max)        .00 MB (current)
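
To compare these two figures across runs, a short script could pull them out of the output. The following is only a sketch: the regular expression is inferred from the two summary lines quoted above, the helper name usage_in_mb is made up for illustration, and treating 1 GB as 1024 MB is an assumption.

```python
import re

# Matches summary lines such as " DISK USED  *         9.10 GB".
# The pattern is inferred from the two quoted lines, not from any Molpro specification.
USAGE_RE = re.compile(r"^\s*(DISK|GA) USED\s+\*\s+([\d.]+)\s+(MB|GB)")

# Assumption: 1 GB is counted as 1024 MB.
TO_MB = {"MB": 1.0, "GB": 1024.0}


def usage_in_mb(output_text):
    """Return {'DISK': mb, 'GA': mb}, using the first figure on each matching line."""
    usage = {}
    for line in output_text.splitlines():
        match = USAGE_RE.match(line)
        if match:
            kind, value, unit = match.groups()
            usage[kind] = float(value) * TO_MB[unit]
    return usage


sample = """\
 DISK USED  *         9.10 GB
 GA USED    *       120.58 MB (max)        .00 MB (current)
"""
usage = usage_in_mb(sample)
print(usage)
print("disk / GA ratio: %.1f" % (usage["DISK"] / usage["GA"]))
```

For the GA line, the sketch keeps the "(max)" figure, which is the first number on the line.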

And in fact, at the beginning of the input I set:
 memory,200,M

(This is not the GA memory, but the -G option is ignored, and I do not understand why.)
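
One thing that may be worth checking is the unit of the memory card. The sketch below assumes that the card counts 8-byte words and that the scale M means 10^6 words, so that it requests words rather than bytes. Both points are assumptions to be confirmed against the Molpro manual, and the helper name memory_card_bytes is made up for illustration.

```python
BYTES_PER_WORD = 8  # assumption: one Molpro "word" is an 8-byte real


def memory_card_bytes(n, scale=""):
    """Bytes requested by `memory,n,scale`, assuming K = 1e3 words and M = 1e6 words."""
    factor = {"": 1, "K": 1_000, "M": 1_000_000}[scale.upper()]
    return n * factor * BYTES_PER_WORD


per_task = memory_card_bytes(200, "M")
tasks_per_node = 8  # from the "MPP nodes" listing: 8 processes on each of sp154 and sp152

print("per task: %.0f MB" % (per_task / 1e6))
print("per node: %.1f GB" % (per_task * tasks_per_node / 1e9))
```

Under these assumptions the card would allow about 1.6 GB per task rather than 200 MB, which would have to fit alongside any GA memory within the limit for each task.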

Can any of you explain some of these facts and give some suggestions for parallel runs?

For instance, I also tried direct calculations, but:
1. it was very slow
2. it terminated with the error:

******
FILE 5 RECORD    1380 OFFSET=          0. NOT FOUND

 Records on file 5

 IREC   NAME  TYPE       OFFSET      LENGTH  IMPLEMENTATION  EXT  PREV  PARENT  MPP_STATE
    1   4000              4096.      21301.              df    0     0       0          1
    2   4001             25397.     166404.              df    0     0       0          1
    3   4002            191801.      10725.              df    0     0       0          0
    4   4003            202526.     178782.              df    0     0       0          1
    5  35020            381308.      10496.              df    0     0       0          1
    6   3600            391804.        273.              df    0     0       0          1
    7   3601            392077.        273.              df    0     0       0          1
    8  35000            392350.         10.              df    0     0       0          1
    9  35001            392360.         10.              df    0     0       0          1
   10  35010            392370.        320.              df    0     0       0          1
   11  35011            392690.        320.              df    0     0       0          1
   12   7005            393010.     314964.              df    0     0       0          1
   13   8005            707974.     314964.              df    0     0       0          1
   14   9101           1022938.    9567696.              df    0     0       0          0
   15   9103          10590634.    9567696.              df    0     0       0          0

 ? Error
 ? Record not found
 ? The problem occurs in readm

 ERROR EXIT
 CURRENT STACK:      CIPRO  MAIN
*******
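
The record listing in this error output can be checked mechanically. The sketch below copies the NAME, OFFSET and LENGTH values from the listing by hand, then tests two things: whether record 1380 appears at all, and whether each record starts exactly where the previous one ends. The helper names has_record and is_contiguous are made up for illustration and are not part of Molpro.

```python
# (name, offset, length) copied from the "Records on file 5" listing above.
RECORDS = [
    (4000, 4096, 21301),
    (4001, 25397, 166404),
    (4002, 191801, 10725),
    (4003, 202526, 178782),
    (35020, 381308, 10496),
    (3600, 391804, 273),
    (3601, 392077, 273),
    (35000, 392350, 10),
    (35001, 392360, 10),
    (35010, 392370, 320),
    (35011, 392690, 320),
    (7005, 393010, 314964),
    (8005, 707974, 314964),
    (9101, 1022938, 9567696),
    (9103, 10590634, 9567696),
]


def has_record(records, name):
    """True if a record with this name appears in the listing."""
    return any(rec_name == name for rec_name, _, _ in records)


def is_contiguous(records):
    """True if every record starts exactly where the previous one ends."""
    return all(
        prev_off + prev_len == next_off
        for (_, prev_off, prev_len), (_, next_off, _) in zip(records, records[1:])
    )


print("record 1380 listed:", has_record(RECORDS, 1380))
print("listing contiguous:", is_contiguous(RECORDS))
```

If the listing is gapless and record 1380 is absent, that would suggest the record was never written to file 5, rather than written and later lost.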

Many thanks for any help.

  Regards


    Sigismondo Boschi


--
Sigismondo Boschi, Ph.D.               tel: +39 051 6171559
CINECA (High Performance Systems)      fax: +39 051 6137273 - 6132198
via Magnanelli, 6/3                    http://instm.cineca.it
40033 Casalecchio di Reno (BO)-ITALY   http://www.cineca.it
