I am trying to find out the speed of my cluster using HPL but I am not able
to understand what values to set in HPL.dat to find out the peak performance
(e.g. the values of N, NB, PxQ, etc). Kindly help me in this regard.
following is the HPL.dat I'm currently using as a load-generator
for my cluster's 8GB dual-socket-single-core nodes. it's not for
generating HPL scores, but rather just to stress the system.
comments:
- you choose the problem size to match your memory - too low a value
will result in not enough work per cpu and lower efficiency. on my
system, I found no significant advantage to using more than 1GB/proc,
but that should depend on the CPU and interconnect speed. (faster
cpus need more work per process to amortize communication; a faster
interconnect lowers the amount of work needed to amortize it.)
- I didn't find any strong dependence on NB.
- P*Q=ncpus; for a switched interconnect, conventional wisdom is that
you want PxQ to be close to square. on my machine (full-bisection
quadrics with dual-processor nodes) I think I've measured it being
slightly faster when run in a 1:2 shape (Q ~= 2P).
- I haven't found any strong performance dependency on any of the
other parameters, but other clusters may be different if they have
slower or non-flat networks, more procs/node, etc.
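The sizing and grid-shape advice above can be sketched in a few lines of Python. This is only an illustration, not part of HPL: the helper names `suggest_n` and `grid_shapes` are hypothetical, the `fraction=0.8` safety margin is an assumed rule of thumb rather than a number from this post, and rounding N down to a multiple of NB is a convenience, not a requirement. The one fixed fact it relies on is that HPL factors an N x N matrix of 8-byte doubles spread across all P*Q processes.

```python
import math

BYTES_PER_DOUBLE = 8  # HPL's matrix is N x N double-precision values


def suggest_n(mem_per_proc_bytes, nprocs, nb=200, fraction=0.8):
    """Return the largest multiple of nb whose N x N matrix fits the budget.

    fraction is an ASSUMED safety margin (room for the OS, MPI buffers,
    HPL workspace); it is not a value taken from the post.
    """
    budget = mem_per_proc_bytes * nprocs * fraction
    n = math.isqrt(int(budget // BYTES_PER_DOUBLE))
    return (n // nb) * nb


def grid_shapes(nprocs):
    """Return every (P, Q) with P * Q == nprocs and P <= Q, squarest first."""
    shapes = [
        (p, nprocs // p)
        for p in range(1, math.isqrt(nprocs) + 1)
        if nprocs % p == 0
    ]
    return sorted(shapes, key=lambda s: s[1] - s[0])


# Example: 8 processes with a budget of 1 GiB each.
print(suggest_n(2**30, nprocs=8))
print(grid_shapes(8))
```

For 8 processes `grid_shapes` lists (2, 4) first, which is also the 1:2 shape mentioned above; which shape is actually fastest still has to be measured on the machine in question.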
regards, mark hahn.
HPLinpack benchmark input file
Innovative Computing Laboratory, University of Tennessee
HPL.out      output file name (if any)
6            device out (6=stdout,7=stderr,file)
5            # of problems sizes (N)
1000 31700 31700 31700 31700  Ns
1            # of NBs
200          NBs
0            PMAP process mapping (0=Row-,1=Column-major)
1            # of process grids (P x Q)
1            Ps
2            Qs
16.0         threshold
1            # of panel fact
1            PFACTs (0=left, 1=Crout, 2=Right)
1            # of recursive stopping criterium
4            NBMINs (>= 1)
1            # of panels in recursion
2            NDIVs
1            # of recursive panel fact.
1            RFACTs (0=left, 1=Crout, 2=Right)
1            # of broadcast
1            BCASTs (0=1rg,1=1rM,2=2rg,3=2rM,4=Lng,5=LnM)
1            # of lookahead depth
1            DEPTHs (>=0)
2            SWAP (0=bin-exch,1=long,2=mix)
64           swapping threshold
0            L1 in (0=transposed,1=no-transposed) form
0            U  in (0=transposed,1=no-transposed) form
1            Equilibration (0=no,1=yes)
8            memory alignment in double (> 0)
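As a rough sanity check on the file above, the matrix footprint for a given N can be computed directly. The sketch below is hypothetical helper code, not something HPL provides; it counts only the N x N matrix of 8-byte doubles and ignores HPL's workspace and MPI buffers, so real usage will be somewhat higher.

```python
GIB = 2**30


def matrix_bytes(n):
    """Bytes needed for an n x n matrix of 8-byte doubles."""
    return 8 * n * n


def per_process_bytes(n, p, q):
    """Approximate share of the matrix held by each of the p * q processes."""
    return matrix_bytes(n) / (p * q)


# Values taken from the HPL.dat above: N = 31700 on a 1 x 2 grid.
n, p, q = 31700, 1, 2
print(f"total:       {matrix_bytes(n) / GIB:.2f} GiB")
print(f"per process: {per_process_bytes(n, p, q) / GIB:.2f} GiB")
```

Comparing the total against the memory of one node shows how close a stress-test setting like this one runs to the limit; a run aimed at a good score rather than load would start from the per-process budget instead.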