On Fri, Oct 06, 2006 at 03:35:30PM -0600, [EMAIL PROTECTED] wrote:
> Quoting Douglas Roberts <[EMAIL PROTECTED]>:
>
> > If you go to any of the supercomputing centers such as NCSA, SDSC, or PSC,
> > you do not see parallel java apps running on any of their machines (with the
> > occasional exception of a parallel newbie trying, with great difficulty, to
> > make something work). The reasons:
> >
> >    1. there are few supported message passing toolkits that support
> >       parallel java apps,
> >    2. java runs 3-4 times slower than C, C++, Fortran, and machine time
> >       is expensive, and finally
> >    3. there are well-designed and maintained languages, toolkits and APIs
> >       for implementing HPC applications, and parallel developers use them
> >       instead of java.
>
> I expect in the next few years some supercomputing niches will start to use
> hypervisors like Xen. For a cost of 20% or so, this will allow queueing
> systems like LSF to bring jobs on and off line with uniform checkpointing. It
> will remove the need for ad-hoc checkpointing code in applications by allowing
> any executable to be stored (much as a laptop does when it goes to sleep) and/or
> migrated from one system to another.
>
> I would be very happy if I could submit a job and have it run indefinitely to
> completion...instead of having it kicked out every 6-12 hours for manual
> restart, or a procedure where I have to write scripts to figure out where
> things stand and adaptively resubmit. 20% is nothing compared to that
> inefficiency!

Interesting comment, but checkpointing of single image tasks was never a
real showstopper. It was a practical option on many systems, e.g. Irix, and
could have been solved for Linux at any time. Also, EcoLab (for agent-based
modelling) provides trivial checkpointing functionality for serial codes,
but it does get more interesting when using it in parallel.

However, Xen will not solve the showstoppers that occur for codes that use
sockets, i.e. all distributed memory message passing jobs, and jobs using
floating-licensed commercial software. I would prefer that the use of Xen be
simply a user-specifiable option for doing checkpointing.

>
> While there are well-designed and maintained languages for HPC (MPI and
> OpenMP), they are only the most basic of infrastructure. MPI is a pain to
> use and OpenMP requires a big SMP system. Maybe when there are
> Hypertransport cables and implementations of languages like Fortress or
> Chapel life will be better. (e.g. Infiniband hasn't resulted in useful and
> used distributed shared memory systems.)
>

ClassdescMP takes away most of the pain of MPI. There are other options too,
for example the recently added Boost.MPI package. I'm planning on taking a
look at Boost.MPI to see how it compares with ClassdescMP... (Note I'm
being a one-eyed C++ person here, though...)

>
> ============================================================
> FRIAM Applied Complexity Group listserv
> Meets Fridays 9a-11:30 at cafe at St. John's College
> lectures, archives, unsubscribe, maps at http://www.friam.org

-- 
*PS: A number of people ask me about the attachment to my email, which
is of type "application/pgp-signature". Don't worry, it is not a
virus. It is an electronic signature that may be used to verify this
email came from me if you have PGP or GPG installed. Otherwise, you
may safely ignore this attachment.

----------------------------------------------------------------------------
A/Prof Russell Standish                  Phone 0425 253119 (mobile)
Mathematics
UNSW SYDNEY 2052                         [EMAIL PROTECTED]
Australia                                http://parallel.hpc.unsw.edu.au/rks
            International prefix +612, Interstate prefix 02
----------------------------------------------------------------------------
