John,

Along the same lines as your computational chemistry hobby is a project I am involved in: Folding. See http://folding.stanford.edu for more information. Anyway, Stanford uses the COSM software as its distributed-computing management platform to assign work units and collect the results from thousands of machines: http://www.mithral.com/projects/cosm/
It should be possible to adapt their software platform to your collection of SMP servers. It would likely take some effort on your part, but it may be worth it.

James

----- Original Message ----
From: John McKelvey <[EMAIL PROTECTED]>
To: [email protected]
Sent: Monday, January 7, 2008 8:52:36 AM
Subject: [fwlug] Fwd: Linux clustering

---------- Forwarded message ----------
From: John McKelvey <[EMAIL PROTECTED]>
Date: Jan 7, 2008 8:41 AM
Subject: Re: [fwlug] Linux clustering
To: JAMES SCOTT <[EMAIL PROTECTED]>

James,

Many thanks for your comments! The one about cross-mounting a directory "made some lights go on." Getting data back from the other machines was going to be an issue; I think cross-mounting reduces the problem to doing a remote procedure call to the other machine, since I can have identical executables and run-time libraries on each machine.

Things are programmed entirely in Fortran. [I know... at one time I knew Algol :-) I started computing with it in 1965. Computational chemists do almost all CPU-intensive work in Fortran, in the past often worrying about things like the impact of file block size and disk track length on I/O performance.] I have found that the 'call system(" ")' statement in Fortran lets me do a lot of "command line" things easily.

This all helps me see both the forest and the trees, but I'm sure I will need an additional suggestion or two, so please feel free to make other comments!

Gratefully,

John

On Jan 7, 2008 12:17 AM, JAMES SCOTT <[EMAIL PROTECTED]> wrote:

John,

Rob's reply is a good starting point for what I think of as a command channel. There is a logical data channel that you can set up to complement the command channel and make data collection easier: NFS, or shared disk. Simply enable NFS on both machines and 'export' (share) a directory from one, then 'mount' (use) it from the other; the two machines now have a single directory in common, i.e. the data channel is established.
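[Editor's sketch: the export/mount steps described above might look like the following on RHEL4-era machines. The hostnames `smpbox` and `xeonbox` and the directory `/work` are placeholders, not names from the thread.]

```shell
# --- on smpbox (the machine sharing the directory) ---
# Declare the share in /etc/exports; client name and options are examples.
echo '/work  xeonbox(rw,sync,no_subtree_check)' >> /etc/exports
exportfs -ra          # re-read /etc/exports and publish the share
service nfs start     # make sure the NFS daemons are running (RHEL4 style)

# --- on xeonbox (the machine using the share) ---
mkdir -p /work
mount -t nfs smpbox:/work /work   # both machines now see one /work
```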
With a data channel in place, you could write bash scripts that query a 'new-work' file and execute any commands found there. Be sure to add some type of queuing or locking mechanism to prevent nodes from reading the work file while the main workstation is adding new commands to it. Tell me more about the work steps and I might be interested in writing the scripts, or at least in getting you started.

As I'm sure you know, there are several options for setting up a cluster. How much change are you willing to impose on the machines' current configurations? A true (tightly coupled) cluster configuration would limit the machines' general-purpose usage; the suggested use of a data channel plus a command channel is comparable to a loosely coupled cluster or grid. There might already be a remote-job-submission program available; search Google for a 'how-to' on the subject. Although I think their tools are aimed at true grid/cluster setups, you might be interested in this site: http://www.cse.scitech.ac.uk/about_us/index.shtml. Also, I'm available to help, as are others; but how much help were you looking for?

James

----- Original Message ----
From: John McKelvey <[EMAIL PROTECTED]>
To: [email protected]
Sent: Saturday, January 5, 2008 10:14:33 PM
Subject: [fwlug] Linux clustering

Hello!

I am a retired chemist and a user and abuser of computers. [For fun I do computational chemistry, and keep a dual-dual 4-processor AMD box running RHEL4 cranking 24/7.] I have an additional box, a dual-core Xeon, that I would like to cluster with the AMD box. I run only _extremely_ coarse-grained parallel codes, with identical executables that run on any Linux box. I run a fitting procedure that runs a particular executable on hundreds of examples, one at a time, collects the results, adjusts parameters, and does it all again, over and over, until finished. There is no communication between nodes.
Each node does a complete, separate, discrete task. Node 0 knows when a pass through the data has been completed, adjusts parameters, and farms out jobs, over and over. But I'm not much of a systems person. I have this running OK on the SMP box; I just need to know how to farm out some of the work to the Xeon box. There is very little data moved around, so standard old Ethernet through my Verizon router should be fine. [Four machines are cabled in, plus a wireless machine.] I need a bit of help and advice. Is there someone available to help me get this going?

Many thanks!

John McKelvey
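[Editor's sketch: the 'new-work' file with locking that James suggests could look like the worker script below. The shared directory path, the file names, and the use of flock(1) to serialize access are assumptions for illustration, not details from the thread.]

```shell
#!/bin/sh
# Worker-side job puller, assuming the NFS-shared directory is mounted at
# $WORKDIR (placeholder path) and the main workstation appends one shell
# command per line to $WORKDIR/new-work.
WORKDIR=${WORKDIR:-/shared/work}

# Atomically take the first command off the queue.  flock(1) serializes
# access, so two nodes never grab the same line, and the workstation can
# hold the same lock while appending new commands.
next_job() {
    flock "$WORKDIR/new-work.lock" sh -c '
        [ -s "$1" ] || exit 1   # queue empty: tell the caller to stop
        head -n 1 "$1"          # hand the first command to the caller
        sed -i 1d "$1"          # and remove it from the queue
    ' sh "$WORKDIR/new-work"
}

# Pull and run jobs until the queue drains, logging output per node.
run_worker() {
    while job=$(next_job); do
        eval "$job" >> "$WORKDIR/results.$(hostname)"
    done
}
```

Each node would run `run_worker` in a loop or from cron; because the lock and the queue live on the shared mount, adding a third machine is just another `mount` plus another copy of this script.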
_______________________________________________
Fwlug mailing list
[email protected]
http://fortwaynelug.org/mailman/listinfo/fwlug_fortwaynelug.org
