Re: [OMPI users] lammps MD code fails with Open MPI 1.3
It's probably not the same issue, as this is one of the very few codes I maintain that is C++ and not Fortran :-( It behaved similarly on another system when I built it against a new version (1.0??) of MVAPICH. I had to roll back a version from that as well. I may contact the lammps people and see if they know what's going on.

Jeff F. Pummill
Senior Linux Cluster Administrator
TeraGrid Campus Champion - UofA
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575-4590
http://hpc.uark.edu
"In theory, there is no difference between theory and practice. But in practice, there is!" -- anonymous

Jeff Squyres wrote:
> Actually, there was a big Fortran bug that crept in after 1.3 that was just fixed on the trunk last night. If you're using Fortran applications with some compilers (e.g., Intel), the 1.3.1 nightly snapshots may have hung in some cases. The problem should be fixed in tonight's 1.3.1 nightly snapshot.

On Feb 20, 2009, at 12:46 AM, Nysal Jan wrote:
> It could be the same bug reported here: http://www.open-mpi.org/community/lists/users/2009/02/8010.php
> Can you try a recent snapshot of 1.3.1 (http://www.open-mpi.org/nightly/v1.3/) to verify if this has been fixed?
> --Nysal

On Thu, 2009-02-19 at 16:09 -0600, Jeff Pummill wrote:
> I built a fresh version of lammps v29Jan09 against Open MPI 1.3, which in turn was built with GNU compilers v4.2.4 on an Ubuntu 8.04 x86_64 box. This Open MPI build was able to generate usable binaries such as XHPL and NPB, but the lammps binary it generated was not usable. I tried a couple of different versions of the lammps source, to no avail. There were no errors during the builds and a binary was created, but when executing the job it quickly exits with no messages other than:
>
> jpummil@stealth:~$ mpirun -np 4 -hostfile hosts /home/jpummil/lmp_Stealth-OMPI < in.testbench_small
> LAMMPS (22 Jan 2008)
>
> Interestingly, I downloaded Open MPI 1.2.8, built it with the same configure options I had used with 1.3, and it worked. I'm getting by fine with 1.2.8. I just wanted to file a possible bug report on 1.3 and see if others have seen this behavior. Cheers!
> --
> Jeff F. Pummill
> Senior Linux Cluster Administrator
> TeraGrid Campus Champion - UofA
> University of Arkansas
[OMPI users] lammps MD code fails with Open MPI 1.3
I built a fresh version of lammps v29Jan09 against Open MPI 1.3, which in turn was built with GNU compilers v4.2.4 on an Ubuntu 8.04 x86_64 box. This Open MPI build was able to generate usable binaries such as XHPL and NPB, but the lammps binary it generated was not usable. I tried a couple of different versions of the lammps source, to no avail. There were no errors during the builds and a binary was created, but when executing the job it quickly exits with no messages other than:

jpummil@stealth:~$ mpirun -np 4 -hostfile hosts /home/jpummil/lmp_Stealth-OMPI < in.testbench_small
LAMMPS (22 Jan 2008)

Interestingly, I downloaded Open MPI 1.2.8, built it with the same configure options I had used with 1.3, and it worked. I'm getting by fine with 1.2.8. I just wanted to file a possible bug report on 1.3 and see if others have seen this behavior. Cheers!

--
Jeff F. Pummill
Senior Linux Cluster Administrator
TeraGrid Campus Champion - UofA
University of Arkansas
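If the hang is related to how stdin gets forwarded under 1.3, one low-cost experiment is to take stdin out of the picture entirely (a hedged sketch; the -in switch is LAMMPS's command-line way of naming the input file, assuming the build in question supports it):

  # Same run, but the input file is passed as an argument instead of
  # being redirected through mpirun's stdin handling.
  mpirun -np 4 -hostfile hosts /home/jpummil/lmp_Stealth-OMPI -in in.testbench_small

If that run gets past the banner, the problem is in stdin forwarding rather than in the MPI calls LAMMPS makes.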
Re: [OMPI users] MPI-2 Supported on Open MPI 1.2.5?
Haha, yeah, we found out about that one when trying to run Linpack with threaded BLAS implementations.

On the MPI-2 note, is anyone running MATLAB and the parallel toolkit under Open MPI? They are unreasonably obscure about which MPI they need, although I do believe they need MPI-2 functions. If it will work with Open MPI and is not a nightmare to set up, I may try it, as some of my users would be elated. If there are excessive problems, I may opt for SciLab, which is supposed to be an "equivalent" and open source.

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575-4590
http://hpc.uark.edu
"In theory, there is no difference between theory and practice. But in practice, there is!" -- anonymous

Brian Budge wrote:
> One small (or to some, not so small) note is that full multi-threading with OpenMPI is very unlikely to work with InfiniBand right now.
> Brian

On Mon, Mar 10, 2008 at 6:24 AM, Michael <mk...@ieee.org> wrote:
> Quick answer, till you get a complete answer: yes, OpenMPI has long supported most of the MPI-2 features.
> Michael
>
> On Mar 7, 2008, at 7:44 AM, Jeff Pummill wrote:
> > Just a quick question...
> > Does Open MPI 1.2.5 support most or all of the MPI-2 directives and features?
> > I have a user who specified MVAPICH2 as he needs some features like extra task spawning, but I am trying to standardize on Open MPI compiled against InfiniBand for my primary software stack.
> > Thanks!
> > --
> > Jeff F. Pummill
> > Senior Linux Cluster Administrator
> > University of Arkansas
[OMPI users] MPI-2 Supported on Open MPI 1.2.5?
Just a quick question...

Does Open MPI 1.2.5 support most or all of the MPI-2 directives and features?

I have a user who specified MVAPICH2 as he needs some features like extra task spawning, but I am trying to standardize on Open MPI compiled against InfiniBand for my primary software stack.

Thanks!

--
Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
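One quick way to see what a given install was actually built with is ompi_info (a sketch; the exact output labels vary a bit between Open MPI releases, so the grep patterns here are assumptions):

  # Summarize the build: version line plus thread-support settings.
  ompi_info | grep -i -e "open mpi:" -e "thread"
  # The MPI-2 dynamic process calls (MPI_Comm_spawn and friends) have
  # long been part of standard Open MPI builds, per the reply above, so
  # a tiny spawn test program is the surest check for a specific install.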
Re: [OMPI users] openMPI on 64 bit SUSE 10.2 OS
Is it possible that this could be a problem with /usr/lib64 as opposed to /usr/lib? Just a thought...

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas

Hsieh, Pei-Ying (MED US) wrote:
> Hi Edgar and Galen,
> Thanks for the quick reply! What puzzles me is that, on 32-bit OpenSUSE, I was able to compile the elmer solver without any issue using the same script. I am planning to use the HYPRE library, but the configure output indicated that it cannot find hypre either, which is another puzzle to me. I will look into this further.
> Best, pei
>
> -----Original Message-----
> From: Edgar Gabriel
> Sent: Tuesday, February 12, 2008 4:28 PM
> To: Open MPI Users
> Subject: Re: [OMPI users] openMPI on 64 bit SUSE 10.2 OS
>
> I doubt that this has anything to do with the platform. We have been running Open MPI on a 64-bit architecture using SuSE 10.2 here for quite a while, successfully. However, your configure log indicates that parpack could not be found, so you might have to change the CFLAGS and LDFLAGS in order for your configure script to find the according library.
>
> Hsieh, Pei-Ying (MED US) wrote:
> > configure: error: The MPI version needs parpack. Disabling MPI.
> > peiying@saturn:~/elmer/elmer-5.4.0/fem-5.4.0>
>
> Thanks
> Edgar
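Since the quoted advice is to adjust CFLAGS/LDFLAGS, a 64-bit SUSE attempt might look like this (a rough sketch only; the parpack/HYPRE install paths are placeholders, not known values, and elmer's own configure options are unchanged):

  # Point the elmer configure at the 64-bit library directories.
  export LDFLAGS="-L/usr/lib64 -L/opt/parpack/lib64 -L/opt/hypre/lib64"
  export CPPFLAGS="-I/opt/hypre/include"
  ./configure ...   # re-run the same configure script as on 32-bit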
Re: [OMPI users] flash2.5 with openmpi
I'm guessing he means the ASC FLASH code, which simulates star explosions... Brock?

Jeff F. Pummill
University of Arkansas

Doug Reeder wrote:
> Brock,
> Do you mean flash memory, like a USB memory stick? What kind of file system is on the memory? Is there some filesystem limit you are bumping into?
> Doug Reeder

On Jan 25, 2008, at 8:38 AM, Brock Palen wrote:
> Is anyone using FLASH with openMPI? We are here, but whenever it tries to write its second checkpoint file it segfaults once it gets to 2.2GB, always in the same location. Debugging is a pain, as it takes 3 days to get to that point. Just wondering if anyone else has seen this same behavior.
> Brock Palen
> Center for Advanced Computing
> bro...@umich.edu
> (734)936-1985
Re: [OMPI users] Tracing the library using gdb and xterm
Krishna,

When you log in to the remote system, use ssh -X or ssh -Y, which will tunnel the xterm back through the ssh connection.

Jeff Pummill
University of Arkansas

Krishna Chaitanya wrote:
> Hi,
> I have been tracing the interactions between the PERUSE and MPI library on one machine. I have been using gdb along with xterm to have two windows open at the same time as I step through the code. I wish to get a better glimpse of the workings of the point-to-point calls by launching the job on two machines and tracing the flow in a similar manner. This is where I stand as of now:
>
> mpirun --prefix /usr/local -hostfile machines -np 2 xterm -e gdb peruse_ex1
> xterm Xt error: Can't open display:
> xterm: DISPLAY is not set
>
> I tried using the display option for xterm and setting the value as 0.0, but that was not of much help. If someone can guide me as to where the DISPLAY parameter has to be set to allow the remote machine to open the xterm window, it will be of great help.
> Thanks,
> Krishna
>
> --
> In the middle of difficulty, lies opportunity
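Note that the DISPLAY the ranks inherit has to name an X server the compute nodes can actually reach; a display created by ssh forwarding on the head node is only visible from that node. A common pattern looks like this (a sketch; the host names are placeholders, xhost is the quick-and-dirty access control, and -x is Open MPI's switch for pushing environment variables out to the ranks):

  # On the workstation running the X server, let the compute nodes connect:
  xhost +node1 +node2
  # Then export a DISPLAY that points back at that workstation:
  mpirun --prefix /usr/local -hostfile machines -np 2 \
      -x DISPLAY=myworkstation:0 xterm -e gdb peruse_ex1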
Re: [OMPI users] "Hostfile" on Multicore Node?
So, it appears that for a machine of this type (dual quad-core CPUs), this approach would be correct for my tests...

[jpummill@n1 bin]$ more my-hosts
n1 slots=8 max_slots=8

...and subsequently, launch two jobs in this configuration:

/home/jpummill/openmpi-1.2.2/bin/mpirun --hostfile my-hosts -np 4 --byslot ./cg.C.4

It appears that this does avoid oversubscribing any particular core, as I am not exceeding my core count by running just the two jobs requiring 4 cores each.

Thanks,
Jeff Pummill

George Bosilca wrote:
> The cleaner way to define such an environment is by using the max-slots and/or slots options in the hostfile. Here is a FAQ entry about how Open MPI deals with these options: http://www.open-mpi.org/faq/?category=running#mpirun-scheduling
> george.

On Oct 26, 2007, at 10:52 AM, Jeff Pummill wrote:
> I am doing some testing on a variety of 8-core nodes in which I just want to execute a couple of executables and have them distributed to the available cores without overlapping. Typically, this would be done with a parameter like -machinefile machines, but I have no idea what names to put into the machines file as this is a single node with two quad core cpu's. As I am launching the jobs sans scheduler, I need to specify what cores to run on, I would think, to keep from overscheduling some cores while others receive nothing to do at all.
> Simple suggestions? Maybe Open MPI takes care of this detail for me?
> Thanks!
> Jeff Pummill
[OMPI users] "Hostfile" on Multicore Node?
I am doing some testing on a variety of 8-core nodes in which I just want to execute a couple of executables and have them distributed to the available cores without overlapping. Typically, this would be done with a parameter like -machinefile machines, but I have no idea what names to put into the machines file, as this is a single node with two quad-core CPUs. As I am launching the jobs sans scheduler, I would think I need to specify which cores to run on, to keep from overscheduling some cores while others receive nothing to do at all.

Simple suggestions? Maybe Open MPI takes care of this detail for me?

Thanks!
Jeff Pummill
Re: [OMPI users] SLURM vs. Torque?
SLURM was really easy to build and install, plus it's a project of LLNL, and I love stuff that the Nat'l Labs architect. The SLURM message board is also very active and quick to respond to questions and problems.

Jeff F. Pummill

Bill Johnstone wrote:
> Hello All.
> We are starting to need resource/scheduling management for our small cluster, and I was wondering if any of you could provide comments on what you think about Torque vs. SLURM? On the basis of the appearance of active development as well as the documentation, SLURM seems to be superior, but can anyone shed light on how they compare in use?
> I realize the truth in the stock answer of "it depends on what you need/want," but as of yet we are not experienced enough with this kind of thing to have a set of firm requirements. At this point, we can probably adapt our workflow/usage a little bit to accommodate the way the resource manager works. And of course we'll be using OpenMPI with whatever resource manager we go with.
> Anyway, enough from me -- I'm looking to hear others' experiences and viewpoints. Thanks for any input!
Re: [OMPI users] OpenMPI Documentation?
Jeff,

Count us in at the UofA. My initial impressions of Open MPI are very good, and I would be open to contributing to this effort as time allows.

Thanks!

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575-4590
http://hpc.uark.edu
"A supercomputer is a device for turning compute-bound problems into I/O-bound problems." -Seymour Cray

Jeff Squyres wrote:
> So there are at least a few people who are interested in this effort (keep chiming in if you are interested so that we can get a tally of who would like to be involved). What kind of resources / organization would be useful for this group? Indiana University graciously hosts all of Open MPI's electronic resources (Subversion, web site, bug tracking, DNS, mailing lists, ...) and I certainly can't speak for them, but if we ask nicely, I'd be willing to bet that they would add some hosting services for a documentation project (if such additional resources would be helpful, of course). I would also be happy to host a teleconference if talking about all this start/admin stuff for an hour would save 1-2 weeks' worth of detailed e-mails.
>
> The only current documentation we have is:
> - the web FAQ
> - the README in the tarball
>
> What is conspicuously missing is a nice PDF and/or HTML tarball with comprehensive documentation. But I think that the FAQ/README also fit into the general category of documentation, so it might make sense to put all 3 of these items under the control of one group. The obvious rationale here is that all three could stay in tighter sync if there's one group monitoring all 3.
>
> One point worth mentioning: Open MPI is all about community consensus, but "s/he who implements usually wins". :-) So if we get an active group working on documentation, the FAQ could be totally re-done if the group so decides (for example).
>
> All this being said, the OMPI developers *have* talked about documentation a bit over time. Here are some of the points from prior discussions, in no particular order:
>
> - It is highly desirable to have documentation that can be output in multiple different forms (PDF, HTML, ...whatever). If possible, the docs should be shipped in distribution tarballs and hosted on the OMPI web site.
>
> - LAM/MPI had two great docs: one for installing LAM/MPI and one for using LAM/MPI. These might be good example documents for what Open MPI might want to do (see http://www.lam-mpi.org/using/docs/), regardless of the back-end technology used to generate the docs. Source LaTeX for these guides is available if it would be helpful (I wrote most of them).
>
> - It would be most helpful if the documentation is written in a tool that has free editors, preferably cross-platform and available in multiple POSIX-like environments (Solaris, Linux, OS X). MS Office was explicitly rejected because of its requirement for Windows/OS X (other Office clones were not really discussed). LaTeX was discussed but wasn't favored due to the steep learning curve and general lack of experience with it outside of academia.
>
> - The first documentation should be aimed at users. Developer documentation might follow.
>
> - Once upon a time, we developers started to use doxygen for documentation, but it has proven to be lousy for book-like entities (IMNSHO). Doxygen is decent for code documentation, but not documents.
>
> - A few recent discussions about documentation came to the conclusion that Docbook (www.docbook.org) looked promising, but we didn't get deep into details / investigating the feasibility.
> One obvious Big Project using Docbook is Subversion (see http://svnbook.red-bean.com/). Docbook-produced HTML and PDF seem to look both pretty and functional.
>
> - It would also be nice if sub-distributions of Open MPI could take the documentation and -- in some defined, automated fashion -- be able to do the following:
>   - insert their own "chapters" or "sections" that are specific to that sub-distribution (e.g., Sun ClusterTools have some Solaris-specific stuff, OFED has some OpenFabrics-specific stuff, etc.)
>   - remove/"turn off" specific sections of documentation (e.g., OFED would likely not include any documentation about Myricom networks [and vice versa])
>   This would go a long way towards being able to keep the community documentation in sync with docs included in targeted/vendor OMPI releases.
>
> - The OMPI web site is almost entirely written in PHP and is mirrored around the world. It would be *strongly* preferred if the web-site hosting of the docs is fully mirror-able (because assumedly docs are one of the things that users would want to browse the most). Hence, requiring a new kind of server other than HTML/PHP would require very, very strong rationale. :-)
>
> - The technology of choice for displaying on the web site is PHP. But that still leaves open a wide variety
[OMPI users] MVAPI Options on Job Submission
I have successfully compiled Open MPI 1.2.3 against the Intel 8.1 compiler suite and an old (3 years) mvapi stack using the following configure:

configure --prefix=/nfsutil/openmpi-1.2.3 --with-mvapi=/usr/local/topspin/ CC=icc CXX=icpc F77=ifort FC=ifort

Do I need to assign any particular flags on the command line at job submission to ensure that it is using the IB network instead of TCP? Or possibly disable the Gig-E with ^tcp to see if it still runs successfully?

I just want to be sure that Open MPI is actually USING the IB network and mvapi.

Thanks!
Jeff Pummill
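One way to make the answer unambiguous is to restrict BTL selection explicitly (a sketch; in the 1.2 series the Topspin/VAPI transport appears as the "mvapi" BTL, and the executable name is a placeholder):

  # Confirm the mvapi BTL was compiled in at all:
  ompi_info | grep -i mvapi
  # Positive selection: IB plus shared memory and self only. If mvapi is
  # unusable, this aborts rather than silently falling back to Gig-E.
  mpirun -np 16 --mca btl mvapi,sm,self ./my_app
  # Negative selection: everything except TCP.
  mpirun -np 16 --mca btl ^tcp ./my_app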
[OMPI users] Building OMPI with dated tools & libs
Good morning all,

I have been very impressed so far with OpenMPI on one of our smaller clusters running GNU compilers and Gig-E interconnects, so I am considering a build on our large cluster. The potential problem is that the compilers are Intel 8.1 versions and the InfiniBand is supported by three-year-old Topspin (now Cisco) drivers and libraries.

Basically, this is a cluster that runs a very heavy workload using MVAPICH, thus we have adopted the "if it ain't broke, don't fix it" methodology... all of the drivers, libraries, and compilers are approximately 3 years old.

Would it be reasonable to expect OpenMPI 1.2.3 to build and run in such an environment?

Thanks!
Jeff Pummill
University of Arkansas
Re: [OMPI users] OpenMPI / SLURM Job Issues
Hey Jeff,

Finally got my test nodes back and was looking at the info you sent. On the SLURM page, it states the following:

Open MPI <http://www.open-mpi.org/> relies upon SLURM to allocate resources for the job and then mpirun to initiate the tasks. When using the salloc command, mpirun's -nolocal option is recommended. For example:

$ salloc -n4 sh    # allocates 4 processors and spawns shell for job
mpirun -np 4 -nolocal a.out
exit               # exits shell spawned by initial salloc command

Are you saying that I need to use SLURM's salloc and then pass SLURM a script? Or could I just add it all into the script? For example:

#!/bin/sh
salloc -n4 mpirun my_mpi_application

Then run with: srun -b myscript.sh

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575-4590
http://hpc.uark.edu
"A supercomputer is a device for turning compute-bound problems into I/O-bound problems." -Seymour Cray

Jeff Squyres wrote:
> Ick; I'm surprised that we don't have this info on the FAQ. I'll try to rectify that shortly.
> How are you launching your jobs through SLURM? OMPI currently does not support the "srun -n X my_mpi_application" model for launching MPI jobs. You must either use the -A option to srun (i.e., get an interactive SLURM allocation) or use the -b option (submit a script that runs on the first node in the allocation). Your script can be quite short:
>
> #!/bin/sh
> mpirun my_mpi_application
>
> Note that OMPI will automatically figure out how many CPUs are in your SLURM allocation, so you don't need to specify "-np X". Hence, you can run the same script without modification no matter how many cpus/nodes you get from SLURM.
> It's on the long-term plan to get the "srun -n X my_mpi_application" model to work; it just hasn't bubbled up high enough in the priority stack yet... :-\
>
> On Jun 20, 2007, at 1:59 PM, Jeff Pummill wrote:
> > Just started working with the OpenMPI / SLURM combo this morning. I can successfully launch this job from the command line and it runs to completion, but when launching from SLURM the jobs hang. They appear to just sit with no load apparent on the compute nodes, even though SLURM indicates they are running...
> >
> > [jpummil@trillion ~]$ sinfo -l
> > Wed Jun 20 12:32:29 2007
> > PARTITION AVAIL TIMELIMIT JOB_SIZE  ROOT SHARE GROUPS NODES STATE     NODELIST
> > debug*    up    infinite  1-infinite no   no    all    8     allocated compute-1-[1-8]
> > debug*    up    infinite  1-infinite no   no    all    1     idle      compute-1-0
> >
> > [jpummil@trillion ~]$ squeue -l
> > Wed Jun 20 12:32:20 2007
> > JOBID PARTITION NAME   USER    STATE   TIME  TIMELIMIT NODES NODELIST(REASON)
> > 79    debug     mpirun jpummil RUNNING 5:27  UNLIMITED 2     compute-1-[1-2]
> > 78    debug     mpirun jpummil RUNNING 5:58  UNLIMITED 2     compute-1-[3-4]
> > 77    debug     mpirun jpummil RUNNING 7:00  UNLIMITED 2     compute-1-[5-6]
> > 74    debug     mpirun jpummil RUNNING 11:39 UNLIMITED 2     compute-1-[7-8]
> >
> > Are there any known issues of this nature involving OpenMPI and SLURM?
> > Thanks!
> > Jeff F. Pummill
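Putting the two supported models side by side (a sketch assembled from the quoted advice; script and binary names are placeholders -- note that the allocation is obtained outside the script, not by a salloc line inside it):

  # Interactive model: get an allocation, then mpirun inside it.
  $ salloc -n4 sh           # allocate 4 processors and spawn a shell
  $ mpirun -nolocal a.out   # OMPI sizes itself from the SLURM allocation
  $ exit

  # Batch model: the script contains only the mpirun line...
  $ cat myscript.sh
  #!/bin/sh
  mpirun my_mpi_application
  # ...and srun -b supplies the allocation from outside.
  $ srun -n4 -b myscript.sh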
Re: [OMPI users] f90 support not built with gfortran?
Thanks guys! Setting F77=gfortran did the trick.

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575-4590
http://hpc.uark.edu
"A supercomputer is a device for turning compute-bound problems into I/O-bound problems." -Seymour Cray

Jeff Squyres wrote:
> On Jun 12, 2007, at 5:56 AM, Terry Frankcombe wrote:
> > > I downloaded and configured v1.2.2 this morning on an Opteron cluster using the following configure directives...
> > > ./configure --prefix=/share/apps CC=gcc CXX=g++ F77=g77 FC=gfortran CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64
> > What does config.log say? (Look for 'Fortran 90'.) config.log should be your first port of call when trying to debug build problems in any "configure"-d project.
>
> Exactly. OMPI's configure probably determined that it should not build the F90 bindings, so it didn't (hence, mpif90 is non-functional). If I had to guess, it's because you specified both g77 and gfortran. When using gfortran, you should probably use it for both F77 and FC. That will likely fix your problem.
> If it doesn't, please see this web page for more details on getting help: http://www.open-mpi.org/community/help/
> Consider this a compile-time problem (because OMPI decided not to build the F90 bindings) and send all the information listed.
> Thanks!
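For the archives, the working configure line differs from the failing one only in the F77 setting (everything else is exactly as in the original post):

  ./configure --prefix=/share/apps CC=gcc CXX=g++ F77=gfortran FC=gfortran \
      CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64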
[OMPI users] f90 support not built with gfortran?
Greetings all,

I downloaded and configured v1.2.2 this morning on an Opteron cluster using the following configure directives...

./configure --prefix=/share/apps CC=gcc CXX=g++ F77=g77 FC=gfortran CFLAGS=-m64 CXXFLAGS=-m64 FFLAGS=-m64 FCFLAGS=-m64

Compilation seemed to go OK, and there IS an mpif90 in /bin... but it gives me the following error when I try to compile my source file:

/share/apps/bin/mpif90 -c -I/share/apps/include -O3 ft.f
Unfortunately, this installation of Open MPI was not compiled with Fortran 90 support. As such, the mpif90 compiler is non-functional.

I am certain that gfortran is installed and working correctly, as I tested compilation of a small piece of serial code with it. Is there something I am doing wrong?

--
Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
Re: [OMPI users] Library Definitions
Glad to contribute, Victor!

I am running on a home workstation that uses an AMD 3800 CPU attached to 2 gigs of RAM. My timings for FT were 175 secs on one core and 110 on two cores, with -O3 and -mtune=amd64 as tuning options.

Brock, Terry, and Jeff are all exactly correct in their comments regarding benchmarks. There are simply too many variables to contend with. In addition, one- and two-core runs on a single workstation probably aren't the best evaluation of OpenMPI. As you expand to more devices and generate bigger problems (HPL or HPCC, for example), a better overall picture will emerge.

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas

victor marian wrote:
> Thank you everybody for the advice. I ran the NAS benchmark class B and it runs in 181 seconds on one core and in 90 seconds on two cores, so it scales almost perfectly. What were your timings, Jeff, and what processor do you exactly have? Mine is a Pentium D at 2.8GHz.
> Victor

--- Jeff Pummill <jpum...@uark.edu> wrote:
> Victor,
> Build the FT benchmark, and build it as a class B problem. This will run in the 1-2 minute range instead of the 2-4 seconds the CG class A benchmark does.
> Jeff F. Pummill
> Senior Linux Cluster Administrator
> University of Arkansas

Terry Frankcombe wrote:
> Hi Victor
> I'd suggest 3 seconds of CPU time is far, far too small a problem to do scaling tests with. Even with only 2 CPUs, I wouldn't go below 100 times that.

On Mon, 2007-06-11 at 01:10 -0700, victor marian wrote:
> Hi Jeff
> I ran the NAS Parallel Benchmark and it gives for me:
>
> -bash%/export/home/vmarian/fortran/benchmarks/NPB3.2/NPB3.2-MPI/bin$ mpirun -np 1 cg.A.1
> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
>
> NAS Parallel Benchmarks 3.2 -- CG Benchmark
> Size: 14000
> Iterations: 15
> Number of active processes: 1
> Number of nonzeroes per row: 11
> Eigenvalue shift: .200E+02
> Benchmark completed
> VERIFICATION SUCCESSFUL
> Zeta is 0.171302350540E+02
> Error is 0.512264003323E-13
>
> CG Benchmark Completed.
> Class           = A
> Size            = 14000
> Iterations      = 15
> Time in seconds = 3.02
> Total processes = 1
> Compiled procs  = 1
> Mop/s total     = 495.93
> Mop/s/process   = 495.93
> Operation type  = floating point
> Verification    = SUCCESSFUL
> Version         = 3.2
> Compile date    = 11 Jun 2007
>
> -bash%/export/home/vmarian/fortran/benchmarks/NPB3.2/NPB3.2-MPI/bin$ mpirun -np 2 cg.A.2
> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
> [0,1,1]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
>
> NAS Parallel Benchmarks 3.2 -- CG Benchmark
> Size: 14000
> Iterations: 15
> Number of active processes: 2
> Number of nonzeroes per row: 11
> Eigenvalue shift: .200E+02
> Benchmark completed
> VERIFICATION SUCCESSFUL
> Zeta is 0.171302350540E+02
> Error is 0.522633719989E-13
>
> CG Benchmark Completed.
> Class           = A
> Size            = 14000
> Iterations      = 15
> Time in seconds = 2.47
> Total processes = 2
> Compiled procs  = 2
> Mop/s total     = 606.32
> Mop/s/process   = 303.16
> Operation type  = floating point
> Verification    = SUCCESSFUL
> Version         = 3.2
> Compile date    = 11 Jun 2007
>
> You can remark that the scaling is not as good as yours. Maybe I am having communication problems between processors. You can also remark that I am faster on one process compared to your processor.
Re: [OMPI users] Library Definitions
Victor,

Build the FT benchmark, and build it as a class B problem. This will run in the 1-2 minute range instead of the 2-4 seconds the CG class A benchmark does.

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas

Terry Frankcombe wrote:
> Hi Victor
> I'd suggest 3 seconds of CPU time is far, far too small a problem to do scaling tests with. Even with only 2 CPUs, I wouldn't go below 100 times that.

On Mon, 2007-06-11 at 01:10 -0700, victor marian wrote:
> Hi Jeff
> I ran the NAS Parallel Benchmark and it gives for me:
>
> -bash%/export/home/vmarian/fortran/benchmarks/NPB3.2/NPB3.2-MPI/bin$ mpirun -np 1 cg.A.1
> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
>
> NAS Parallel Benchmarks 3.2 -- CG Benchmark
> Size: 14000
> Iterations: 15
> Number of active processes: 1
> Number of nonzeroes per row: 11
> Eigenvalue shift: .200E+02
> Benchmark completed
> VERIFICATION SUCCESSFUL
> Zeta is 0.171302350540E+02
> Error is 0.512264003323E-13
>
> CG Benchmark Completed.
> Class           = A
> Size            = 14000
> Iterations      = 15
> Time in seconds = 3.02
> Total processes = 1
> Compiled procs  = 1
> Mop/s total     = 495.93
> Mop/s/process   = 495.93
> Operation type  = floating point
> Verification    = SUCCESSFUL
> Version         = 3.2
> Compile date    = 11 Jun 2007
>
> -bash%/export/home/vmarian/fortran/benchmarks/NPB3.2/NPB3.2-MPI/bin$ mpirun -np 2 cg.A.2
> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
> [0,1,1]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
>
> NAS Parallel Benchmarks 3.2 -- CG Benchmark
> Size: 14000
> Iterations: 15
> Number of active processes: 2
> Number of nonzeroes per row: 11
> Eigenvalue shift: .200E+02
> Benchmark completed
> VERIFICATION SUCCESSFUL
> Zeta is 0.171302350540E+02
> Error is 0.522633719989E-13
>
> CG Benchmark Completed.
> Class           = A
> Size            = 14000
> Iterations      = 15
> Time in seconds = 2.47
> Total processes = 2
> Compiled procs  = 2
> Mop/s total     = 606.32
> Mop/s/process   = 303.16
> Operation type  = floating point
> Verification    = SUCCESSFUL
> Version         = 3.2
> Compile date    = 11 Jun 2007
>
> You can remark that the scaling is not as good as yours. Maybe I am having communication problems between processors. You can also remark that I am faster on one process compared to your processor.
> Victor

--- Jeff Pummill <jpum...@uark.edu> wrote:
> Perfect! Thanks Jeff!
> The NAS Parallel Benchmark on a dual-core AMD machine now returns this...
>
> [jpummil@localhost bin]$ mpirun -np 1 cg.A.1
> NAS Parallel Benchmarks 3.2 -- CG Benchmark
> CG Benchmark Completed.
> Class           = A
> Size            = 14000
> Iterations      = 15
> Time in seconds = 4.75
> Total processes = 1
> Compiled procs  = 1
> Mop/s total     = 315.32
>
> ...and...
>
> [jpummil@localhost bin]$ mpirun -np 2 cg.A.2
> NAS Parallel Benchmarks 3.2 -- CG Benchmark
> CG Benchmark Completed.
> Class           = A
> Size            = 14000
> Iterations      = 15
> Time in seconds = 2.48
> Total processes = 2
> Compiled procs  = 2
> Mop/s total     = 604.46
>
> Not quite linear, but one must account for all of the OS traffic that one core or the other must deal with.
> Jeff F. Pummill
> Senior Linux Cluster Administrator
> University of Arkansas
> Fayetteville, Arkansas 72701
> (479) 575-4590
> http://hpc.uark.edu
> "A supercomputer is a device for turning compute-bound problems into I/O-bound problems." -Seymour Cray
> Jeff Squyres wrote:
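For anyone wanting to reproduce this, building the class B FT benchmark in the NPB-MPI tree goes roughly like this (a sketch; it assumes config/make.def already points the MPI Fortran compiler at the Open MPI wrappers):

  # From the NPB3.2-MPI top-level directory:
  make ft CLASS=B NPROCS=2
  # The binary name encodes the class and process count:
  mpirun -np 2 bin/ft.B.2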
Re: [OMPI users] Problem running MPI on a dual-core pentium D
Hey Victor!

I just ran the old classic cpi.c, just to verify that OpenMPI was working. Now I need to grab some actual benchmarking code. I may try the NAS Parallel Benchmarks from here: http://www.nas.nasa.gov/Resources/Software/npb.html

They were pretty easy to build and run under mpich. I can't imagine it'd be any different on OpenMPI.

Jeff F. Pummill

victor marian wrote:
> I can't turn it off right now to look in the BIOS (the computer is not at home), but I think the Pentium D, which is dual-core, doesn't support hyper-threading. The program I made relies on an MPI library (it is not a benchmarking program). I think you are right; maybe I should run a benchmarking program first to see what happens. If you have a benchmarking program I would gladly test it. What is the best way to debug OpenMPI programs? Until now I ran prism, which is part of the Sun ClusterTools.
> Victor

--- Jeff Pummill <jpum...@uark.edu> wrote:
> Victor,
> Just on a hunch, look in your BIOS to see if Hyperthreading is turned on. If so, turn it off. We have seen some unusual behavior on some of our machines unless this is disabled.
> I am interested in your progress, as I have just begun working with OpenMPI as well. I have used mpich for quite some time, but felt compelled to get some experience with OpenMPI too. I just installed it this weekend on an AMD dual-core machine with 2 gigs of RAM. Maybe I will try to replicate your experiment if you can direct me to the program you are benchmarking.
> Jeff F. Pummill
> Senior Linux Cluster Administrator
> University of Arkansas
> Fayetteville, Arkansas 72701
> (479) 575-4590
> http://hpc.uark.edu

victor marian wrote:
> The problem is that my executable file runs on the Pentium D in 80 seconds on two cores and in 25 seconds on one core. And on another Sun SMP machine with 20 processors it runs perfectly (the problem is perfectly scalable).
> Victor Marian
> Laboratory of Machine Elements and Tribology
> University Politehnica of Bucharest, Romania

--- Brock Palen <bro...@umich.edu> wrote:
> It means that your OMPI was compiled to support uDAPL (a type of InfiniBand network) but that your computer does not have such a card installed. Because you don't, it will fall back to Ethernet. But because you are just running on a single machine, you will use the fastest form of communication: shared memory. So you can ignore that message. Unless in the future you add a uDAPL-powered network and you still get that message -- then you need to worry.
> Brock Palen
> Center for Advanced Computing
> bro...@umich.edu
> (734)936-1985

On Jun 10, 2007, at 9:18 AM, victor marian wrote:
> Hello,
> I have a Pentium D computer with Solaris 10 installed. I installed OpenMPI and successfully compiled my Fortran program, but when giving
> mpirun -np 2 progexe
> I receive
> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
> I am a beginner in MPI and don't know what it means. What should I do to solve the problem?
> Thank you.
Re: [OMPI users] Problem running MPI on a dual-core pentium D
Victor,

Just on a hunch, look in your BIOS to see if Hyperthreading is turned on. If so, turn it off. We have seen some unusual behavior on some of our machines unless this is disabled.

I am interested in your progress, as I have just begun working with OpenMPI as well. I have used mpich for quite some time, but felt compelled to get some experience with OpenMPI too. I just installed it this weekend on an AMD dual-core machine with 2 gigs of RAM. Maybe I will try to replicate your experiment if you can direct me to the program you are benchmarking.

Jeff F. Pummill
Senior Linux Cluster Administrator
University of Arkansas
Fayetteville, Arkansas 72701
(479) 575-4590
http://hpc.uark.edu

victor marian wrote:
> The problem is that my executable file runs on the Pentium D in 80 seconds on two cores and in 25 seconds on one core. And on another Sun SMP machine with 20 processors it runs perfectly (the problem is perfectly scalable).
> Victor Marian
> Laboratory of Machine Elements and Tribology
> University Politehnica of Bucharest, Romania

--- Brock Palen wrote:
> It means that your OMPI was compiled to support uDAPL (a type of InfiniBand network) but that your computer does not have such a card installed. Because you don't, it will fall back to Ethernet. But because you are just running on a single machine, you will use the fastest form of communication: shared memory. So you can ignore that message. Unless in the future you add a uDAPL-powered network and you still get that message -- then you need to worry.
> Brock Palen
> Center for Advanced Computing
> bro...@umich.edu
> (734)936-1985

On Jun 10, 2007, at 9:18 AM, victor marian wrote:
> Hello,
> I have a Pentium D computer with Solaris 10 installed. I installed OpenMPI and successfully compiled my Fortran program, but when giving
> mpirun -np 2 progexe
> I receive
> [0,1,0]: uDAPL on host SERVSOLARIS was unable to find any NICs. Another transport will be used instead, although this may result in lower performance.
> I am a beginner in MPI and don't know what it means. What should I do to solve the problem?
> Thank you.