Re: [gt-user] How to distribute problems to multiple resources (computers)?

john alexander sanabria ordonez Mon, 19 Nov 2012 06:41:21 -0800

Well, GRAM interfaces with local batch schedulers such as SGE, Condor, PBS
and so on. In that context, GRAM hides the details of the scheduler to which
a job is submitted. So you have a pool of clusters, and your only duty is to
submit the task; you don't need to care about the job submission details.


GRAM also provides tools and services for monitoring a job's status.
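
To make the point above concrete, here is a minimal sketch in Python that
only builds the command lines (it does not run them). It assumes the usual
GRAM5 contact form "host/jobmanager-<scheduler>"; the host name
gram.example.org is made up, and which jobmanager-* services actually exist
depends on what the site administrator installed. The helper function names
are mine, not part of any Globus API.

```python
import shlex


def gram_contact(host, scheduler):
    """Build a GRAM5 resource contact such as 'host/jobmanager-pbs'.

    Assumption: the site exposes its scheduler under the conventional
    'jobmanager-<scheduler>' service name.
    """
    return f"{host}/jobmanager-{scheduler}"


def submit_command(host, scheduler, executable, *args):
    """Command line for submitting a batch job through GRAM."""
    return ["globus-job-submit", gram_contact(host, scheduler), executable, *args]


def status_command(job_contact):
    """Command line for polling a job, given the contact printed at submit time."""
    return ["globus-job-status", job_contact]


if __name__ == "__main__":
    # Hypothetical host; only the scheduler suffix changes between sites.
    for scheduler in ("pbs", "sge", "condor"):
        print(shlex.join(submit_command("gram.example.org", scheduler, "/bin/hostname")))
```

The only thing that changes between a PBS, SGE or Condor site is the suffix
of the contact string; the executable and its arguments stay the same. Note
that you still have to name the site yourself.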

There are also meta-schedulers such as GridWay where, AFAIK, you submit
your job and it decides on which grid computing resource (i.e. cluster)
your job should run.
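
As a rough illustration of what such a meta-scheduler does on your behalf,
the Python sketch below picks the site that advertises the most free slots.
This is NOT GridWay's actual algorithm, just the simplest possible policy;
the Site class, the contact strings and the slot counts are all invented for
the example. In a real grid the free-slot numbers would come from an
information service rather than being hard-coded.

```python
from dataclasses import dataclass


@dataclass
class Site:
    contact: str     # GRAM contact string (hypothetical values below)
    free_slots: int  # free job slots the site currently advertises


def choose_site(sites, slots_needed=1):
    """Return the site with the most free slots, or None if none has enough.

    Toy policy for illustration only; a real meta-scheduler also weighs
    queue times, data location, user policies, failures and so on.
    """
    candidates = [s for s in sites if s.free_slots >= slots_needed]
    if not candidates:
        return None
    return max(candidates, key=lambda s: s.free_slots)


if __name__ == "__main__":
    pool = [
        Site("cluster-a.example.org/jobmanager-pbs", 4),
        Site("cluster-b.example.org/jobmanager-sge", 32),
        Site("cluster-c.example.org/jobmanager-condor", 0),
    ]
    best = choose_site(pool, slots_needed=8)
    print(best.contact if best else "no site has enough free slots")
```

The chosen contact string is then what gets handed to GRAM for the actual
submission.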

John,

On 19 November 2012 09:14, <[email protected]> wrote:

> Interesting. I thought that's the main reason why anyone would need a grid
> platform: to distribute problems to multiple computers. If GT5 is not doing
> that, then what is it doing?
>
> The quote below is from the documentation of GT5.2.2[1]:
> *"The Grid Resource Allocation and Management (GRAM5) component is used
> to locate, submit, monitor, and cancel jobs on Grid computing resources.
> GRAM5 is not a Local Resource 
> Manager<http://www.globus.org/toolkit/docs/5.2/5.2.2/gram5/#local-resource-manager>,
> but rather a set of services and clients for communicating with a range of
> different batch/cluster job schedulers using a common protocol. GRAM5 is
> meant to address a range of jobs where reliable operation, stateful
> monitoring, credential management, and file staging are important."
> *
> From my understanding, GRAM5 (part of GT5) aims at handling a range of
> jobs (i.e. not a single job), plus monitoring, etc. I assume that since it
> claims to handle a range of jobs, it should somehow figure out which nodes
> to distribute them to.
>
> I think my expectations about GT5 are incorrect. Perhaps I am missing the
> key objectives of GT5. Any clarifications would be appreciated.
>
> Regards,
> J
>
> On 11/19/2012 at 4:38 AM, "Steven C Timm" <[email protected]> wrote:
>
>   The GT4 Globus Toolkit did include an implementation of the Monitoring
> and Discovery Service, which a number of sites can use to advertise
> to some central service, which could then tell the user where to
> globus-job-submit (or globusrun-ws -submit, as GT4 did).
>
> In practice most production grids have some other non-Globus method of
> telling the user which sites are available and how many free slots they
> have. The most common one is the BDII. The Open Science Grid in the US
> uses that, but also uses software known as the GlideinWMS to present the
> whole grid as a single unified resource to users.
>
>
>
> Steve Timm
>
>
>
>
>
> *From:* [email protected] [mailto:
> [email protected]] *On Behalf Of *[email protected]
> *Sent:* Sunday, November 18, 2012 5:52 PM
> *To:* gt-user
> *Subject:* [gt-user] How to distribute problems to multiple resources
> (computers)?
>
>
>
> Greetings GT community,
>
>
>
> Suppose that a pool of computers is able to donate idle CPU time;
> how can a problem (i.e. a piece of code) get executed on them in a
> distributed manner?
>
>
>
> For example, when I use the command globus-job-submit, or globus-job-run,
> how will my local machine know where these jobs should be submitted?
>
>
>
> I'm expecting that every resource should register itself with a discovery
> database (service) that is hosted on one or more servers, and that when
> grid users (e.g. programmers/researchers) submit problems, they submit them
> somewhere that will dispatch them to multiple resources (CPU donors)
> according to a scheduler and an execution management plan that decides what
> to do in case of a failure.
>
>
>
> However, I fail to see how the above thoughts map to GT5 after reading
> the quick start guide at
> http://www.globus.org/toolkit/docs/5.2/5.2.2/admin/quickstart/ -- what is
> in the guide is pretty much controlled by the user/programmer (e.g. he
> specifies which computer to execute which commands on).
>
>
>
> Rgrds,
>
> J
>
>
