Re: [fwlug] Fwd: Linux clustering

Rob Ludwick Tue, 08 Jan 2008 00:08:48 -0800

I think James' comments are very good here.

Depending on how much data you're going to send to it, SSH has a couple
of neat tricks.


Take for example the following command:

ssh [EMAIL PROTECTED] "remote_command" <local_infile.txt >local_outfile.txt

Data is read from local_infile.txt and passed to the standard input of
remote_command, and the output data is written to local_outfile.txt.

The trick is that remote_command needs to be able to process the data
from standard input and write the results to standard output. 
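
If the program insists on file names instead, a small wrapper can make it
behave like a filter. This is only a sketch: the name as_filter and the
calling convention (program takes an input path and then an output path as
its last two arguments) are assumptions for illustration, not something
from John's setup.

```shell
# as_filter PROGRAM [ARGS...]
# Hypothetical wrapper: lets a program that wants "infile outfile" arguments
# be used as a filter (standard input in, standard output out).
as_filter() {
    tmpdir=$(mktemp -d) || return 1
    cat >"$tmpdir/in"                  # save standard input to a temp file
    "$@" "$tmpdir/in" "$tmpdir/out"    # assumed convention: PROGRAM in out
    status=$?
    if [ -f "$tmpdir/out" ]; then
        cat "$tmpdir/out"              # send the results to standard output
    fi
    rm -rf "$tmpdir"
    return "$status"
}
```

Saved as a script on the remote machine, something like this would play the
role of remote_command in the ssh line above.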

As a rough rule of thumb, ssh can encrypt about 2 megabytes of data per
second (this varies with the CPU and the cipher).  So if your data set is
larger than, say, a few megabytes, you'll want to do something else to get
your data into the process.
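
To put a number on that, here is a back-of-the-envelope sketch using the
2 megabytes per second rule of thumb. The rate is a guess rather than a
measurement, I am treating a megabyte as 1024*1024 bytes, and the function
name is made up for this example.

```shell
# Rule-of-thumb ssh throughput in bytes per second (an assumption; measure
# your own machines before trusting it).
RATE=$((2 * 1024 * 1024))

# estimate_seconds BYTES
# Print the estimated time to push BYTES through ssh, rounded up to the
# next whole second.
estimate_seconds() {
    echo $(( ($1 + RATE - 1) / RATE ))
}

estimate_seconds 104857600   # a 100 MiB data set; prints 50
```

So a few megabytes costs a second or two per job, while a data set in the
hundreds of megabytes adds the better part of a minute or more.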

Interestingly enough, ssh also preserves standard error, giving you a
second data pipe for errors and/or status.  This can be captured by
appending " 2>local_errfile.txt" to the command above.  (Provided, of
course, that your remote process makes use of standard error for
something useful.)
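
Putting the three streams together, a minimal sketch might look like the
following. The function names and the SSH variable are hypothetical
(SSH is only there so the ssh binary can be swapped out for testing). The
one extra fact used here is that ssh exits with the exit status of the
remote command, or with 255 if ssh itself hit an error.

```shell
# Assumption: SSH is a placeholder so a different ssh binary can be used.
SSH=${SSH:-ssh}

# run_job HOST COMMAND INFILE OUTFILE ERRFILE
# Feed INFILE to COMMAND on HOST; standard output and standard error are
# captured in separate local files. Returns whatever ssh returns.
run_job() {
    $SSH "$1" "$2" <"$3" >"$4" 2>"$5"
}

# report_job STATUS ERRFILE
# Turn the exit status of run_job into a one-line summary.
report_job() {
    if [ "$1" -eq 0 ]; then
        echo "ok"
    elif [ "$1" -eq 255 ]; then
        # 255 usually means ssh failed (a remote command may also return it)
        echo "ssh itself failed; see $2"
    else
        echo "remote command exited with status $1; see $2"
    fi
}
```

A call would then be along the lines of
run_job user@host "remote_command" local_infile.txt local_outfile.txt local_errfile.txt
followed by report_job $? local_errfile.txt, where user@host stands in for
your own login and machine.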

--R


On Mon, 2008-01-07 at 08:52 -0500, John McKelvey wrote:
> 
> 
> ---------- Forwarded message ----------
> From: John McKelvey <[EMAIL PROTECTED]>
> Date: Jan 7, 2008 8:41 AM
> Subject: Re: [fwlug] Linux clustering
> To: JAMES SCOTT <[EMAIL PROTECTED]>
> 
> 
> James,
> 
> Many thanks for your comments!  The one about crossmounting a
> directory "made some lights go on."  Getting data back from other
> machines was going to be an issue.  I think crossmounting  reduces the
> problem to doing a remote procedure call to the other machine; I can
> have identical executables and run time libraries on each machine.
> Things are programmed entirely in fortran [ I know ... At one time I
> knew Algol.. :-)  I started computing with that in 1965.
> Computational chemists do most all cpu intensive stuff in fortran, in
> the past often worrying about things like the impact of file block
> size and disk track length on IO performance. ]  I have
> found that the 'call system("  ")' command in fortran lets me do a lot
> of "command line" things easily. 
> 
> This all gets me to seeing better both the forest and the trees, but
> I'm sure I will need an additional suggestion or two..  please feel
> free to make other comments!
> 
> Gratefully,
> 
> John
> 
> 
> 
> On Jan 7, 2008 12:17 AM, JAMES SCOTT <[EMAIL PROTECTED]> wrote:
>         John,
>          
>         Rob's reply is a good starting point for what I think of as a
>         command channel.  There is also a logical data channel that
>         you can set up to complement the command channel and make
>         data collection easier: NFS or a shared disk.  Simply enable
>         NFS on both machines and 'export' (share) a directory from
>         one, then 'mount' (use) it from the other; they now have a
>         single directory in common (i.e. the data channel is
>         established).
>          
>         With a data channel in place you could write 'bash' scripts to
>         query a 'new-work' file and execute any found commands.  Be
>         sure to add some type of queuing or locking mechanism to
>         prevent nodes from reading the work-file while the main wkstn
>         is adding new commands to the file.  Tell me more about the
>         work steps and I might be interested in writing the scripts,
>         or at least getting you started.
>          
>         As I'm sure you know there are options available for setting
>         up a cluster.  How much change are you willing to impose on
>         the machines' current configurations?  I.e., a true
>         (tightly-coupled) cluster configuration would limit these
>         machines' general-purpose usage.  The suggested use of a data &
>         command channel is comparable to a loosely-coupled
>         cluster/grid.  There might be a remote-job-submission program
>         already available; search google for a 'how-to' on the
>         subject. 
>          
>         Although I think their tools are true GRID/cluster related,
>         you might be interested in this site
>         http://www.cse.scitech.ac.uk/about_us/index.shtml.  
>          
>         Also, I'm available to help as are others;  but how much help
>         were you looking for?  
>          
>         James,
>         
>          
>         ----- Original Message ----
>         From: John McKelvey <[EMAIL PROTECTED]>
>         To: [email protected]
>         Sent: Saturday, January 5, 2008 10:14:33 PM
>         Subject: [fwlug] Linux clustering
>         
>         
>         Hello!
>         
>         
>         I am a retired chemist and a user and abuser of computers
>         [i.e. for fun I do computational chemistry, and keep a
>         dual-dual 4-processor AMD box running RHEL4 cranking 24/7.  I
>         have an additional box that is a dual-core Xeon that I would
>         like to cluster with the AMD box.  I run only _extremely _
>         coarse grained parallel codes, and identical executables
>         running on any linux box... I run a fitting procedure that
>         runs a particular executable on hundreds of examples, one at a
>         time, collects results, adjusts parameters, and does it all
>         again, over and over, till finished.  There is no
>         communication between nodes.  Each node does a complete,
>         separate, discrete task.  Node0 knows when a pass through the
>         data has been completed, adjusts parameters, and farms out
>         jobs, over and over]  .. but I'm not much of a systems
>         person.. I have this running OK on the SMP box... just need to
>         know how to farm out some of the work to the Xeon box.  There
>         is very little data moved around so standard old ethernet
>         through my Verizon router should be fine. [4 machines are
>         cabled in, plus a wireless machine.] 
>         
>         I need a bit of help and advice.  Is there someone available
>         for helping me get this going?
>          
>         Many thanks!
>          
>         John McKelvey
>         
>         
>         
>         
>         
>         
> 
> 
> 
> _______________________________________________
> Fwlug mailing list
> [email protected]
> http://fortwaynelug.org/mailman/listinfo/fwlug_fortwaynelug.org


