On Fri, Oct 06, 2006 at 03:35:30PM -0600, [EMAIL PROTECTED] wrote:
> Quoting Douglas Roberts <[EMAIL PROTECTED]>:
> 
> > If you go to any of the supercomputing centers such as NCSA, SDSC, or PSC,
> > you do not see parallel java apps running on any of their machines (with the
> > occasional exception of a parallel newbie trying, with great difficulty,
> > to make something work).  The reasons:
> > 
> >    1. there are few supported message passing toolkits that support
> >    parallel java apps,
> >    2. java runs 3-4 times slower than C, C++, Fortran, and machine time
> >    is expensive, and finally
> >    3. there are well-designed and maintained languages, toolkits and APIs
> >    for implementing HPC applications, and parallel developers use them
> >    instead of java.
> 
> I expect in the next few years some supercomputing niches will start to use
> hypervisors like Xen.  By paying the roughly 20% overhead, this will allow
> queueing systems like LSF to bring jobs on and off line with uniform
> checkpointing.  It will remove the need for ad hoc checkpointing code in
> applications by allowing any executable to be stored (much as a laptop does
> when it goes to sleep) and/or migrated from one system to another.
> 
> I would be very happy if I could submit a job and have it run indefinitely
> to completion...instead of having it kicked out every 6-12 hours for a
> manual restart, or a procedure where I have to write scripts to figure out
> where things stand and adaptively resubmit.  20% is nothing compared to
> that inefficiency!

Interesting comment, but checkpointing of single-image tasks was never a
real showstopper. It was a practical option on many systems, e.g. Irix,
and could have been solved for Linux at any time.

Also, EcoLab (for agent-based modelling) provides trivial checkpointing
functionality for serial codes, though it does get more interesting when
using it in parallel.

However, Xen will not solve the showstoppers that occur for codes that
use sockets (i.e. all distributed-memory message-passing jobs) and for
jobs using floating-licensed commercial software.

I would prefer that the use of Xen be simply a user-specifiable option
for doing checkpointing.

> 
> While there are well-designed and maintained tools for HPC (MPI and
> OpenMP), they are only the most basic of infrastructure.  MPI is a pain to
> use, and OpenMP requires a big SMP system.  Maybe when there are
> HyperTransport cables and implementations of languages like Fortress or
> Chapel, life will be better.  (E.g. Infiniband hasn't resulted in useful
> and used distributed shared memory systems.)
> 

ClassdescMP takes away most of the pain of MPI. There are other options
too, for example the recently added Boost.MPI library. I'm planning to
take a look at Boost.MPI to see how it compares with ClassdescMP...

(Note I'm being a one-eyed C++ person here, though...)

> 
> ============================================================
> FRIAM Applied Complexity Group listserv
> Meets Fridays 9a-11:30 at cafe at St. John's College
> lectures, archives, unsubscribe, maps at http://www.friam.org

-- 
*PS: A number of people ask me about the attachment to my email, which
is of type "application/pgp-signature". Don't worry, it is not a
virus. It is an electronic signature, that may be used to verify this
email came from me if you have PGP or GPG installed. Otherwise, you
may safely ignore this attachment.

----------------------------------------------------------------------------
A/Prof Russell Standish                  Phone 0425 253119 (mobile)
Mathematics                              
UNSW SYDNEY 2052                         [EMAIL PROTECTED]             
Australia                                http://parallel.hpc.unsw.edu.au/rks
            International prefix  +612, Interstate prefix 02
----------------------------------------------------------------------------

