Hi all,

Jan, Martin, and Tino, thanks for your help. I tried MPICH and also read the
instructions in 'Submitting Condor Jobs to Globus Toolkit 4' at
https://bi.offis.de/wisent/tiki-index.php?page=Condor-GT4. But I'm still
quite confused.

What I really want to do is similar to what is described at the beginning of
'Chapter 4 Execution Management' in 'GT4_Primer_0.6.pdf', that is:
So you want to:
. Make a program available as a network service (with size varying)
. Dispatch
. Run an executable on a remote computer.
. Run a parallel program across multiple distributed computers.
. Run a set of loosely coupled tasks
. Steer a computation (?)
These tasks all fall within the purview of execution management.

I just want to try out the idea of distributed computing; I don't need
parallel computing. For example, the server sends some commands to the
grid nodes, and after the grid nodes execute them, the results are
collected back on the server.

Have I made myself clear? I hope for your guidance and instructions.

Regards,
Denny(Deming Yin)
The grid really makes me confused and exhausted...:)

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf
Of Jan Ploski
Sent: Thursday, 8 May 2008 5:44 PM
To: demingyin
Cc: [email protected]
Subject: Re: [gt-user] some problem of WS GRAM

[EMAIL PROTECTED] wrote on 05/08/2008 08:19:04 AM:

> Hi all,
> 
> These days I'm trying to use WS GRAM to submit some jobs, but I
> still don't quite understand the WS GRAM mechanism. I can now
> submit some dummy jobs on my Grid node, such as
> 'globusrun-ws -submit -c /bin/touch touched_it'
> or 'globusrun-ws -submit -S -f a.rsl'.
> 
> But suppose, for example, that I want to add up the numbers from 1
> to 2n and, in order to speed up the process, I want to sum 1 to n on
> Grid Node1 and n+1 to 2n on Grid Node2. How could I do that?
> Maybe I should first write a web service following the online
> documentation 'Submitting a job in Java using WS GRAM', but then how
> can WS GRAM distribute the task to different Grid nodes for me?
> 
> Can anyone give some directions? A detailed example would be much
> appreciated.

There are several ways, none too easy.

1. Write a small MPI program and submit a job of type 'mpi' to run it. 
Within your program you distribute work to the nodes using MPI calls and 
gather the results. This is the best performing solution, but it forces 
you to program in C/Fortran, and it depends on MPI being available and 
correctly configured at the target site.
2. Submit two jobs, each of which does a part of the computation and 
stores away the results (say, in a file). Submit a third job which 
combines the results. Because you have three interdependent jobs, you will 
already need a metascheduler/workflow engine to coordinate the submissions 
automatically.
3. Submit a job of type 'multiple' which then does all the process 
coordination on site. Because the job type 'multiple' simply runs the 
specified executable, with the same command-line arguments, on n nodes, 
you will need some mechanism to compute the process numbers within that 
executable in order to allocate work to processes. It should also 
synchronize executions because you want the results to be combined in the 
end. You could use our 'MultiJob' module, described at 
https://bi.offis.de/wisent/tiki-index.php?page=Condor-GT4-BigJobs
The examples provided on this page assume that you're using Condor as your 
job submission client; however, they could be adapted to use globusrun-ws 
(see also http://www.teragridforum.org/mediawiki/index.php?title=WS-Gram).
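
Whichever of the three approaches you pick, the arithmetic of splitting
the work is the same. As a rough sketch only (Python is used here just
for brevity, and the names chunk_bounds, partial_sum and combine are
made up for illustration, not part of Globus, Condor or MultiJob), each
job could work out its own sub-range from its process number, compute a
partial sum, and a final step could add the partial results together:

```python
def chunk_bounds(total, workers, rank):
    """Return the inclusive (first, last) sub-range of 1..total
    assigned to worker number `rank` (0-based). Hypothetical helper."""
    base, extra = divmod(total, workers)
    # The first `extra` workers each take one additional element.
    first = rank * base + min(rank, extra) + 1
    last = first + base - 1 + (1 if rank < extra else 0)
    return first, last


def partial_sum(first, last):
    """Work done by one job: sum the integers first..last inclusive."""
    return sum(range(first, last + 1))


def combine(partials):
    """Work done by the final job: add up the partial results."""
    return sum(partials)


if __name__ == "__main__":
    # Local stand-in for two grid jobs summing 1..2n with n = 5.
    n = 5
    partials = [partial_sum(*chunk_bounds(2 * n, 2, r)) for r in range(2)]
    print(partials, combine(partials))
```

In a real run each partial_sum call would be a separate job that writes
its result to a file, and the combining job would read those files; how
the process number reaches each job depends on the approach chosen
above.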

Regards,
Jan Ploski

