Vinod,

A standard measure of performance for supercomputers is the number of floating-point operations per second (flops) achieved on a real, 64-bit precision matrix LU factorization and solve problem. This can be done on an MPI machine using the scalable linear algebra package ScaLAPACK, put together by several institutions including the University of Tennessee, Knoxville. The software package can be downloaded from many different sites, but the main site is:

http://www.netlib.org/scalapack/

Two principal routines in ScaLAPACK, PDGETRF (double precision factorization) and PDGETRS (double precision solve), are used for this measure on a 32-bit machine. The speed of the factorization and solve depends on the parameters used in the problem. The test drivers in this software package are set up to let you vary these parameters, including the size of the matrix factored and the number of right-hand sides to solve.

For your own comparison, a Cray X-MP in 1987 could factor and solve a 1000x1000 matrix with 1000 right-hand sides at approximately 220 Mflops (million floating-point operations per second).

Paul

Paul Harten, Ph.D.
Team Leader - P2Tools Design & Development
Industrial Multimedia Branch
Sustainable Technology Division
National Risk Management Research Laboratory
U.S. Environmental Protection Agency
26 West Martin Luther King Drive
Cincinnati, Ohio 45268
T: 513-569-7045
F: 513-569-7471
E: [EMAIL PROTECTED]

VINOD <[EMAIL PROTECTED]>
Sent by: [EMAIL PROTECTED]
07/08/2002 04:59 AM
To: Kyndig Renshai <[EMAIL PROTECTED]>
cc: oscar users <[EMAIL PROTECTED]>
Subject: RE: [Oscar-users] Tuning info on OSCAR Cluster Documentation ?

Thank you, Ren.

What I understood from this is that performance tuning happens at two different levels.

1. The parallel program which works on the cluster. In fact, I have found a number of documents on fine-tuning pvmpov with skyvase.pov.

2. But could you give some info on how to tune my Beowulf cluster? I suppose this will include individual tuning of PBS, Maui, PVM, MPICH, etc.

Is there any single utility or test program for measuring the performance? If not a single one, where can I find the individual tools for each package?

Thanks again,
Vinod.

-----Original Message-----
From: Kyndig Renshai [mailto:[EMAIL PROTECTED]]
Sent: Thursday, July 04, 2002 7:53 PM
To: VINOD
Subject: Re: [Oscar-users] Tuning info on OSCAR Cluster Documentation ?
Ok, you're going about this all wrong ... always research your problem before implementing it ... I'm no expert, so bear with me while I try to say what I understand ...

1. A cluster is a group of computers that are linked by a logical structure (it's a software solution for connectivity and communication). That's all OSCAR does. It uses schedulers (PBS and Maui) to launch batch jobs. Batch jobs are not necessarily parallel, just compute intensive. Schedulers try to optimize the turnaround time for these types of jobs by allocating them to different members of the cluster to get them finished as quickly as possible.

2. pvmpov is a parallel pgm. Think of it this way: it has a master coordinator program on the master node that splits up the job into smaller parts (a divide and conquer algorithm) and passes them off to the other processors. When each processor (or node) is done, it passes its portion of the job back to the coordinator, which collates the data to create the final solution. Your parallel pgm is what needs the tuning.

I'm not saying that clusters cannot be optimized. Your parallel pgm (this particular style of processing, coordinator/subordinate) is limited by a few factors:

- The speed of the processors. If you mix processor speeds on the nodes, for instance, the slowest processor can slow down the overall throughput. That is, unless the coordinator has algorithms to load balance ...

- The channel speed. Channel bandwidth and the number of hops to the coordinator all play a role in latency. (But we're talking local clusters here.)

There are a few models used in getting data out to the nodes, and each has its own limitations. Whether it's totally distributed, peer to peer, or some mixture (where there's messaging between coordinator and nodes, node to node, or both) can affect what goes on in the channel and hence latency, and so affect the overall time to complete the job.
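A rough way to see the effect of the slowest processor described above is a toy timing model. The sketch below is only an illustration, not how pvmpov actually schedules work: the function names, the node speeds, and the latency figure are all made-up assumptions, and the model charges one fixed latency per send with no overlap between communication and computation.

```python
import heapq


def static_split_time(total_work, speeds, per_send_latency):
    """Equal share per node, one send each; the job ends when the slowest node ends."""
    share = total_work / len(speeds)
    return max(per_send_latency + share / s for s in speeds)


def dynamic_chunk_time(total_work, speeds, per_send_latency, n_chunks):
    """Coordinator hands the next chunk to whichever node becomes free first."""
    chunk = total_work / n_chunks
    # Heap of (time at which the node becomes free, node index).
    free_at = [(0.0, i) for i in range(len(speeds))]
    heapq.heapify(free_at)
    finish = 0.0
    for _ in range(n_chunks):
        t, i = heapq.heappop(free_at)
        done = t + per_send_latency + chunk / speeds[i]
        finish = max(finish, done)
        heapq.heappush(free_at, (done, i))
    return finish


if __name__ == "__main__":
    # Hypothetical cluster: three equal nodes and one at half speed.
    speeds = [1.0, 1.0, 1.0, 0.5]
    print("static split :", static_split_time(120.0, speeds, 0.1))
    print("dynamic, 12 chunks:", dynamic_chunk_time(120.0, speeds, 0.1, 12))
```

In this model the static split is held back by the half-speed node, while handing out smaller chunks on demand lets the faster nodes absorb more of the work. The price is one latency charge per chunk, so very small chunks eventually cost more than they save.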
For something like pvmpov, granularity (the size of the chunks being sent into the channel) matters: the smaller the pieces, the more sends; with larger pieces there may be increased latency (queueing theory). pvmpov is not load balancing either, so if you have a mix of processors on the nodes, the turnaround time may be held up by the slowest processor.

- Synchronous sends, for instance (waiting until all processors are done with a particular part of the mosaic of the job before continuing to the next part), will also determine the overall time to completion of the problem.

There is documentation on how parallel pgms work. Start with google.com. beowulf.org is also the first place to look, and check out the few educational sources.

Ya, I know you're not quite interested in the academics of parallel pgmming and more interested in how to optimize the hardware and OSCAR in particular. Understand the problem in both domains. For instance, if you look at Condor (condor.wisc.edu, or do a google), you'll get a completely different notion of how a cluster can be put together, and a secondary notion of how clusters should work: it's not about hardware so much as it is a logical configuration notion. All you really want is for your pgm to be able to use the resources of other computers. There just happen to be a few problems with allowing this kind of access across an entire network, and Condor addresses these. Also take a look at grid computing (www.globus.org) and, for an implementation, grid-in-a-box (www.ncsa.uiuc.edu) (should be in their downloads section).

OSCAR creates a generic cluster. It only sets up the most basic infrastructure for cluster behavior of the Beowulf style, using component-type integration to create the product.
As you can see by the many questions concerning problems, component-based software creation can be challenging, especially if the designers do not also publish an architectural and design document that helps the user figure out how and why things do the things they do. The pgm is aimed at network administrators who understand Linux and shell scripting, all of which I'm sure you're probably pretty familiar with.

I hope this helps.

Ren

VINOD wrote:

Dear All,

I am a newbie, but I have been trying to get the hang of OSCAR for my cluster for the last few weeks. Finally I could set one up successfully, thanks to the detailed documentation provided along with the software. Hats off to the team who worked on this.

I have a small request regarding the documentation. Nowhere does it mention how to tune my cluster for better throughput.

As I posted earlier, I have done a benchmark with POVRAY (skyvase.pov), and I got a rendering time of 8 seconds. Fine. But how will I know that I have a good performer in my hands? How will I tune it for better throughput? What are the parameters to be tuned and taken care of?

Could somebody throw some light on these?

Thanks,
Vinod.

_______________________________________________
Oscar-users mailing list
[EMAIL PROTECTED]
https://lists.sourceforge.net/lists/listinfo/oscar-users
