Is it just me, or do HPC clustering and virtualization fall on
opposite ends of the spectrum?

depends on your definitions.  virtualization certainly conflicts with
those aspects of HPC which require bare-metal performance.  even if you
can reduce the overhead of virtualization, the question is why?  look
at the basic sort of HPC environment: compute nodes running a single
distro, controlled by a scheduler.  from the user's or job's
perspective, there are just some nodes - it doesn't matter which ones,
or even how many in total.  the user _should_ be able to assume that
when they land on a node, it behaves as if freshly installed and booted
de novo.  we don't reboot nodes between jobs, of course, or even make
much effort towards preventing a serial job from noticing other serial
jobs on the same node (as containers would, let alone VMs).  but we
could, without tons of effort; it would just lower utilization.

virtualization is about a few things:
        - improve utilization by coalescing low-duty-cycle services.
        - isolate services from each other - either to directly arbitrate
        runtime resource contention, or to disentangle configurations.
        - encapsulate all the state of a server so it can be moved.

I think the first axis is quite non-HPC, since I don't think of HPC jobs
as being like idle services.  (OTOH, many clusters have good utilization
because multiple workloads get interleaved _above_ the processor level.)
the second axis is not often an HPC problem, at least not in my
experience, where J Random Fortran user doesn't really care that much
about the environment (ie, they want f77 and lapack and empty queues).
migration has some HPC appeal, since it permits defragmenting a
cluster, as well as better preemption.

Gavin, not necessarily. You could have a cluster of HPC compute nodes
running a minimal base OS.
Then install specific virtual machines with different OS/software stacks
each time you run a job.

or for each job, just install the provided OS image on the bare metal...
your job's done, have it halt or reboot the node ;)
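a scheduler prolog along those lines could be sketched like this
(Python; the node names, image path, and reboot step are all invented
for illustration - a real site would hang this off the scheduler's
prolog hooks and use PXE/kexec or an imaging tool to reprovision):

```python
import subprocess

def reprovision_cmds(node, image):
    """return the steps a per-job prolog would run to put the job's own
    OS image on bare metal.  purely illustrative: the target path and
    the use of scp/ssh are made up, not a real provisioning system."""
    return [
        ["scp", image, f"{node}:/boot/job-image"],  # ship the image out
        ["ssh", node, "reboot"],                    # boot de novo into it
    ]

def run_prolog(node, image, dry_run=True):
    """execute (or just print) the reprovisioning steps for one node."""
    for cmd in reprovision_cmds(node, image):
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
```

the dry_run flag is there because actually reimaging a node is exactly
the kind of thing you want to rehearse first.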

OK, this is probably more relevant for grid or cloud computing - I first
thought this would be a good idea when seeing that (at the time) the
CERN LHC Grid software would only run with Redhat 7.2.
So you could imagine 'packaging up' a virtual machine which has your
particular OS flavour/libraries/compilers and shipping
it out with the job.

grid and cloud computing are all part of the same game, no?  along with
massively parallel low-latency MPI, old-style vector supercomputing,
GPU-assisted computing, throughput serial farming, etc.

right, that's one of the axes of the problem-space: whether the app
gets its own custom runtime environment (in the sense of kernel, libc,
etc).  another axis is the degree to which the app has to contend for
resources (as in an overcommitted normal cluster, or a VM without
guaranteed resources.)

Another reason could be fault tolerance - you run VMs on the compute
nodes. When you detect a hardware fault is coming along
(eg from ECC errors or disk errors) you perform a live migration from
one node to another - and your job keeps on trucking.
(In theory, checkpointing needed etc. etc.)
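the policy side of that could look something like this sketch (Python;
the ECC threshold and the node table are invented knobs - a real
monitor would read the kernel's EDAC counters, and the migration
itself would go through something like libvirt):

```python
def should_evacuate(corrected_ecc_errors, threshold=10):
    """decide whether a node looks likely to fail soon.  the threshold
    is a made-up policy knob, not a vendor recommendation."""
    return corrected_ecc_errors >= threshold

def pick_target(free_ram_by_node, vm_ram):
    """pick a live-migration target with room for the VM, or None.
    prefers the node with the most free RAM (one plausible policy)."""
    fits = {n: r for n, r in free_ram_by_node.items() if r >= vm_ram}
    return max(fits, key=fits.get) if fits else None
```

e.g. a node reporting 12 corrected ECC errors against a threshold of
10 would get flagged, and its VMs sent to whichever node has the most
free RAM that still fits them.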

I'm pretty skeptical about this - the main issue with checkpointing is
external side-effects.  checkpointing networked apps (including MPI) is
hard because you have state "in flight", so you can only freeze-dry the
state by quiescing (letting the messages land, etc).
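that quiescing step can be shown in miniature (Python; `apply_msg` is
a hypothetical handler and the deque stands in for the network - real
MPI checkpointers have to drain every channel, not one queue):

```python
from collections import deque

def quiesce_and_checkpoint(state, in_flight, apply_msg):
    """stop new sends (assumed already done by the caller), let every
    in-flight message land, then snapshot.  only after the drain is
    the snapshot consistent - that's the freeze-dry step."""
    while in_flight:
        apply_msg(state, in_flight.popleft())   # let the message land
    return dict(state)                          # consistent snapshot
```

with a running counter, for instance, draining messages [1, 2, 3] into
{'count': 0} before snapshotting yields {'count': 6}; snapshotting
before the drain would silently lose those three messages.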

the "live migration" demos I've seen have been apps that are tolerant
to the loss of in-flight transactions (or which retry automatically).

so I don't think virt is any kind of paradigm-changer, just like
manycore merely stretches existing definitions.

-mark
_______________________________________________
Beowulf mailing list, [email protected] sponsored by Penguin Computing
To change your subscription (digest mode or unsubscribe) visit 
http://www.beowulf.org/mailman/listinfo/beowulf
