I am trying to find out the speed of my cluster using HPL but I am not able
to understand what values to set in HPL.dat to find out the peak performance
(e.g. the values of N, NB, PxQ, etc). Kindly help me in this regard.

the following is the HPL.dat I'm currently using as a load generator
for my cluster's 8GB dual-socket, single-core nodes.  it's not tuned for
generating HPL scores, just for stressing the system.

comments:
        - you choose the problem size (N) to match your memory - too low a
        value gives each cpu too little work and lowers efficiency.  on my
        system, I found no significant advantage to using more than 1GB/proc,
        but that should depend on the CPU and interconnect speed.  (faster
        cpus need more work per proc to amortize communication; a faster
        interconnect lowers the amount of work needed to amortize it.)

        - I didn't find any strong dependence on NB.

        - P*Q=ncpus; for a switched interconnect, conventional wisdom is that
        you want PxQ to be close to square.  on my machine (full-bisection
        quadrics with dual-processor nodes) I think I've measured it being
        slightly faster when run in a 1:2 shape (Q ~= 2P).

        - I haven't found any strong performance dependency on any of the
        other parameters, but other clusters may be different if they have
        slower or non-flat networks, more procs/node, etc.
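
a rough sketch of the sizing rules above, in python.  the helper names and
the 80% memory fraction are assumptions for illustration, not anything HPL
provides: the matrix is N x N doubles (8 bytes each), N is rounded down to
a multiple of NB, and the candidate P x Q grids are listed most-square first
so you can time the near-square and 1:2 shapes against each other.

```python
import math

def problem_size(total_mem_bytes, mem_fraction=0.8, nb=200):
    """Largest N (a multiple of nb) whose N x N matrix of doubles
    fits in mem_fraction of total memory.  mem_fraction is an assumed
    rule of thumb; lower it if the nodes start swapping."""
    n = int(math.sqrt(total_mem_bytes * mem_fraction / 8))  # 8 bytes per double
    return (n // nb) * nb

def grid_shapes(nprocs):
    """All (P, Q) with P * Q == nprocs and P <= Q, most square first."""
    shapes = [(p, nprocs // p)
              for p in range(1, math.isqrt(nprocs) + 1)
              if nprocs % p == 0]
    return sorted(shapes, key=lambda pq: pq[1] - pq[0])

if __name__ == "__main__":
    # one 8 GiB node with two procs, as in the HPL.dat in this message
    print(problem_size(8 * 2**30))
    # candidate grids for a hypothetical 32-proc run
    print(grid_shapes(32))
```

treat the result as a starting point only: the best N and grid shape still
have to be measured on the actual cpus and interconnect.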

regards, mark hahn.

HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
5            # of problems sizes (N)
1000 31700 31700 31700 31700  Ns
1            # of NBs
200         NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1       Ps
2       Qs
16.0         threshold
1            # of panel fact
1            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
4           NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
1            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64          swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
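
as a sanity check on the file above, this small python sketch estimates the
matrix storage per process for each distinct N with the 1 x 2 grid.  the
function name is made up for illustration, and the figure is only a lower
bound, since HPL allocates some workspace beyond the matrix itself.

```python
def matrix_gib_per_process(n, p, q):
    """Approximate GiB of matrix storage per process for an
    N x N double-precision matrix spread over a P x Q grid."""
    return 8 * n * n / (p * q) / 2**30

if __name__ == "__main__":
    for n in (1000, 31700):  # the distinct Ns in the file above
        print(n, round(matrix_gib_per_process(n, 1, 2), 2))
```

the large N comes out well above 1GB/proc, which fits the stated purpose of
stressing the node rather than finding the most efficient problem size.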
_______________________________________________
Beowulf mailing list, [email protected]
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
