> Grid computing is interesting as a way to make the best of the cheap computing power provided by Intel boxes, on the one hand, and the robustness of
> the mainframe, on the other, opening new avenues for integrating and using various resources with their own strengths. If I got it right, it seems that
> applications need to be grid-aware to use it effectively, which makes it a no-no as a short-term solution.
Not necessarily. If your computing cluster is designed so that any node has access to the same data at the same place (i.e., via a distributed file system like AFS or GFS), then you can use GRAM as a "smart" remote job submission manager with binaries that are not specifically grid-enabled. Think of submitting a job to a queue manager that picks a machine, runs the binary, and returns the output to you. In the case of an X application or some such thing, you just need to supply a --display parameter to ensure that whichever node runs the job directs the display back to your workstation.

Example: you have a license for Framemaker on only two systems (it's node-locked to specific CPU ids). Define a GRAM resource group with only those two systems in it. On your client systems, write a small script that, when executed as /usr/local/bin/frame, does a gram-submit to the resource group containing the two machines licensed to run Framemaker. GRAM picks the least loaded machine in the resource group, hands your job off to the job manager on one of those two machines, and runs Framemaker with the display pointed back to your workstation. Framemaker doesn't know squat about grids -- but you just used it successfully in a grid environment.

> And then I had this idea when I was reading about openMosix.
> For those of you who haven't heard, check the homepage at http://openmosix.sourceforge.net/.
> [...]
> So what if we could patch a zLinux image kernel and then made it one of the nodes of one of these clusters? If possible, we would have a way to cleanly
> offload CPU intensive jobs from the linux/mainframe to cheaper external engines.

OpenMosix works only with clusters made up of a single homogeneous architecture.
For this to work on 390, all your nodes would have to be 390 architecture -- the mainframe doesn't use IA32 processors of any type, so you could only migrate processes between processors of a similar type (i.e., IA32 to IA32, regardless of manufacturer). So, unfortunately, OpenMosix really doesn't buy you much at all.

As for Amoeba, it had the same restriction as OpenMosix: all the Amoeba nodes had to be of similar architecture for process migration to occur. "Stunning" a process during migration could have provided the opportunity to select an alternate-architecture binary bundled into the executable, but Tanenbaum rightly claimed it was too difficult a task to find a valid restart point in a different-architecture binary (consider the CISC/RISC case -- yeesh. Ugly.).

> Any mainframe and VM gurus care to comment? Is there any reason why this can't be done? Do we lose any more reliability features? Am I missing
> something that makes it totally impractical?

Consider this as an alternative approach: keep the command and control components of the grid on 390 (the GRAM resource managers and gatekeepers do not have to be on the same architecture as the processing farm itself), and keep the execution nodes on whatever architecture delivers cheap MIPS. Produce a common global file system for all nodes by using AFS or GFS, and define a set of resource groups for different types of work. Use the 390 to submit jobs into the grid cluster and handle the mundane editing/file management work. This approach lets you use the grid capabilities to manage workload, route output, etc. without forcing applications to rewrite code -- and as your applications get more sophisticated about grids, they can actually start to exploit the environment consciously.
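To make the earlier wrapper idea concrete, here is a rough sketch of what a /usr/local/bin/frame script might look like. The gram-submit command name, the node names, and the load figures are all illustrative assumptions -- the real submission command and the scheduler's load metric depend on your Globus toolkit setup; GRAM would do the least-loaded selection itself, which this sketch only simulates.

```shell
#!/bin/sh
# Hypothetical /usr/local/bin/frame wrapper: instead of running
# Framemaker locally, submit it to the resource group containing the
# two licensed hosts, with the display pointed back at this workstation.

# The two Framemaker-licensed nodes, with made-up load figures standing
# in for whatever metric GRAM's scheduler actually consults:
NODES="framehost1:0.82 framehost2:0.17"

# Pick the least loaded node, as GRAM would within the resource group.
least_loaded=$(printf '%s\n' $NODES | sort -t: -k2 -n | head -n1 | cut -d: -f1)

# Hand the job off (gram-submit and the maker path are illustrative),
# directing the X display back to this workstation.
echo gram-submit --node "$least_loaded" \
    /opt/frame/bin/maker -display "$(hostname):0" "$@"
```

The same shape works for any node-locked or resource-constrained binary: the application itself never learns it is running in a grid.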
The mainframe component could remain the command/control element without problem (and take on some of the interesting storage management capabilities, central backups, etc), and as the grid I/O model continues to evolve, provide access to much more sophisticated storage management than the Unix world only dreams about. Consider also that this makes a lot of the scheduling and resource management tasks that are currently hard in the Unix-only world a lot simpler -- you get PROP and a lot of other sophisticated automation products that don't yet exist in the Unix world. We've published a number of papers (see http://www.sinenomine.net or any recent VM Tech Conference proceedings) about configurations like this one, and we have a number of customers doing variations on this theme at the moment. It works rather well, lets you take advantage of the best of both worlds, *and* prepares you for larger-scale grid deployments.

-- db
