Hi all,

Jan, Martin and Tino, thanks for your help. I tried MPICH and also read the instructions in 'Submitting Condor Jobs to Globus Toolkit 4' at https://bi.offis.de/wisent/tiki-index.php?page=Condor-GT4, but I'm still quite confused.
What I really want to do is similar to what is described at the beginning of 'Chapter 4 Execution Management' in 'GT4_Primer_0.6.pdf', that is:

  So you want to:
  . Make a program available as a network service (with size varying)
  . Dispatch
  . Run an executable on a remote computer.
  . Run a parallel program across multiple distributed computers.
  . Run a set of loosely coupled tasks
  . Steer a computation (?)
  These tasks all fall within the purview of execution management.

I just want to try out the idea of distributed computing; I have no need for parallel computing. For example, the server sends out some commands to the grid nodes, and after the grid nodes have executed the commands, the results are collected back on the server.

Have I made myself clear? I hope for your guidance and instructions.

Regards,
Denny (Deming Yin)

The grid really leaves me confused and exhausted... :)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Jan Ploski
Sent: Thursday, 8 May 2008 5:44 PM
To: demingyin
Cc: [email protected]
Subject: Re: [gt-user] some problem of WS GRAM

[EMAIL PROTECTED] wrote on 05/08/2008 08:19:04 AM:

> Hi all,
>
> These days I'm trying to use WS GRAM to submit some jobs. But I
> still don't quite understand the mechanism of WS GRAM. I can now
> submit some dummy jobs on my Grid node, such as
> 'globusrun-ws -submit -c /bin/touch touched_it'
> or 'globusrun-ws -submit -S -f a.rsl'.
>
> But suppose, for example, that I want to add up the numbers from 1 to 2n,
> and in order to speed up the process, I want to add 1 to n on Grid Node1,
> and n+1 to 2n on Grid Node2. How could I do that?
> Maybe I should first write a web service following the online
> documentation 'Submitting a job in Java using WS GRAM', but then how
> can WS GRAM distribute the task to different Grid Nodes for me?
>
> Can anyone give some directions? A detailed example would be much
> appreciated.

There are several ways, none too easy.

1. Write a small MPI program and submit a job of type 'mpi' to run it.
Within your program you distribute work to the nodes using MPI calls and gather the results. This is the best-performing solution, but it forces you to program in C/Fortran, and it depends on MPI being available and correctly configured at the target site.

2. Submit two jobs, each of which does a part of the computation and stores away the results (say, in a file). Submit a third job which combines the results. Because you have three interdependent jobs, you will already need a metascheduler/workflow engine to coordinate the submissions automatically.

3. Submit a job of type 'multiple' which then does all the process coordination on site. Because the job type 'multiple' simply runs the specified executable, with the same command-line arguments, on n nodes, you will need some mechanism within that executable to compute each process's number in order to allocate work to processes. That mechanism should also synchronize the processes, because you want the results to be combined at the end. You could use our 'MultiJob' module, described at https://bi.offis.de/wisent/tiki-index.php?page=Condor-GT4-BigJobs

The examples provided on that page assume that you're using Condor as your job submission client; however, they could be adapted to use globusrun-ws (see also http://www.teragridforum.org/mediawiki/index.php?title=WS-Gram).

Regards,
Jan Ploski
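
The work split that options 2 and 3 rely on can be sketched roughly as follows. This is only an illustration of the "add 1 to 2n" example, written in Python and independent of Globus, Condor, or MPI. The function names (chunk_bounds, partial_sum, combine) are made up for this sketch, and it assumes that each process somehow knows its own process number (rank) and the total number of processes; how those values reach the executable depends on the site and is not shown here.

```python
def chunk_bounds(total, num_procs, rank):
    """Return the inclusive (first, last) integers of 1..total assigned to `rank`.

    Hypothetical helper: `rank` is the zero-based process number, which the
    executable is assumed to obtain by some site-specific mechanism.
    """
    base, extra = divmod(total, num_procs)
    # The first `extra` ranks each take one additional number.
    first = rank * base + min(rank, extra) + 1
    last = first + base - 1 + (1 if rank < extra else 0)
    return first, last


def partial_sum(total, num_procs, rank):
    """Compute the part of 1 + 2 + ... + total that belongs to `rank`."""
    first, last = chunk_bounds(total, num_procs, rank)
    return sum(range(first, last + 1))


def combine(partials):
    """Combine the partial results (the job of the 'third' step)."""
    return sum(partials)


if __name__ == "__main__":
    n = 5
    num_procs = 2  # e.g. Grid Node1 and Grid Node2
    # Here the "processes" run one after another in a loop; on a grid each
    # partial sum would be computed by a separate job or process.
    partials = [partial_sum(2 * n, num_procs, r) for r in range(num_procs)]
    print(partials, combine(partials))
```

With n = 5 and two processes, rank 0 adds 1 to 5 and rank 1 adds 6 to 10. In option 2 each partial result would be written to a file by its own job and read back by the combining job; in option 3 the processes would additionally have to wait for one another before the results are combined.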
