> Grid computing is interesting as a way to make the best of the cheap
> computing power provided by Intel boxes, on the one hand, and the
> robustness of the mainframe, on the other, opening new avenues for
> integrating and using various resources with their own strengths. If I
> got it right, it seems that applications need to be grid-aware to use it
> effectively, which makes it a no-no as a short-term solution.

Not necessarily. If your computing cluster is designed so that every node
has access to the same data in the same place (i.e., via a distributed file
system like AFS or GFS), then you can use GRAM as a "smart" remote job
submission manager with binaries that aren't grid-enabled at all. Think of
submitting a job to a queue manager that picks a machine, runs the binary,
and returns the output to you -- in the case of an X application or some
such thing, you just need to supply a --display parameter so that whichever
node runs the job directs the display back to your workstation.

Example: you have a license for Framemaker on only two systems (it's
node-locked to specific CPU IDs). Define a GRAM resource group containing
only those two systems. On your client systems, write a small script --
installed as, say, /usr/local/bin/frame -- that does a gram-submit to the
resource group containing the two machines licensed to run Framemaker.
GRAM picks the least-loaded machine in your resource group, hands your job
off to the job manager on one of those two machines, and runs Framemaker
with the display pointed back to your workstation. Framemaker doesn't know
squat about grids -- but you just used it successfully in a grid
environment.
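The wrapper script above might look something like this. The resource group
name ("frame-hosts"), the Framemaker install path, and the use of the Globus
globus-job-run client are all assumptions for illustration -- substitute
whatever your grid installation actually provides:

```shell
#!/bin/sh
# Sketch of the client-side wrapper, installed as /usr/local/bin/frame.
# Hypothetical names: "frame-hosts" resource group, /opt/frame/bin/maker
# binary, globus-job-run as the submit client.

RESOURCE="frame-hosts"              # group holding the two licensed nodes
BINARY="/opt/frame/bin/maker"       # hypothetical Framemaker install path
DISP="${DISPLAY:-localhost:0.0}"    # route X output back to this workstation

# GRAM picks the least-loaded member of the resource group and hands
# the job to that machine's job manager.
SUBMIT_CMD="globus-job-run $RESOURCE $BINARY --display $DISP"

# A real wrapper would exec this; shown here as a dry run:
echo "$SUBMIT_CMD"
```

The user just types "frame" and never knows (or cares) which of the two
licensed machines actually ran the binary.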

> And then I had this idea when I was reading about openMosix. For those
> of you who haven't heard of it, check the homepage at
> http://openmosix.sourceforge.net/.
> [...]
> So what if we could patch a zLinux image kernel and make it one of the
> nodes of one of these clusters? If possible, we would have a way to
> cleanly offload CPU-intensive jobs from the Linux/mainframe to cheaper
> external engines.

OpenMosix works only with clusters made up of a single homogeneous
architecture. For this to work on 390, all your nodes would have to be 390
architecture -- the mainframe doesn't use IA32 processors of any kind, and
processes can only migrate between processors of the same type (i.e., IA32
to IA32, regardless of manufacturer). So, unfortunately, OpenMosix really
doesn't buy you much here at all.

As for Amoeba, it had the same restriction as OpenMosix: all the Amoeba
nodes had to be of similar architecture for process migration to occur.
"Stunning" a process during migration could have provided an opportunity to
select an alternate-architecture binary bundled into the executable, but
Tanenbaum rightly judged it too difficult to find a valid restart point in
a different architecture's binary (consider the CISC/RISC case -- yeesh.
Ugly.).


> Any mainframe and VM gurus care to comment? Is there any reason why this
> can't be done? Do we lose any more reliability features? Am I missing
> something that makes it totally impractical?

Consider this as an alternative approach: keep the command-and-control
components of the grid on the 390 (the GRAM resource managers and
gatekeepers need not run on the same architecture as the processing farm
itself), and keep the execution nodes on whatever architecture delivers
cheap MIPS. Provide a common global file system for all nodes using AFS or
GFS, and define a set of resource groups for the different types of work.
Use the 390 to submit jobs into the grid cluster and handle the mundane
editing/file management stuff.
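A job submitted from the 390 side could be described in a small GRAM RSL
fragment along these lines (the executable, file paths, and gatekeeper
hostname are made-up placeholders, and the exact attribute set depends on
your Globus version):

```
& (executable = /usr/local/bin/bigjob)
  (arguments = "/gridfs/input.dat")
  (stdout = /gridfs/output.dat)
  (count = 1)
```

With a shared AFS or GFS namespace, the input and output paths resolve the
same way on whichever execution node the gatekeeper hands the job to.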

This approach lets you use the grid capabilities to manage workload, route
output, etc. without forcing anyone to rewrite application code, and as
your applications get more sophisticated about grids, they can start to
exploit the environment consciously. The mainframe component can remain the
command/control element without problem (and take on some of the
interesting storage management capabilities, central backups, etc.), and as
the grid I/O model continues to evolve, provide access to much more
sophisticated storage management than the Unix world only dreams about.
Consider also that this makes a lot of the scheduling and resource
management tasks that are currently hard in the Unix-only world a lot
simpler -- you get PROP and a lot of other sophisticated automation
products that don't exist in the Unix world yet.

We've published a number of papers (see http://www.sinenomine.net or any
recent VM Tech Conference proceedings) about configurations like this one,
and we have a number of customers doing variations on this theme at the
moment. It works rather well, and lets you take advantage of the best of
both worlds *and* be prepared for the larger scale grid deployments.

-- db
