Re: [gt-user] How to set cores-per-node in WS job submission?

Jan Ploski Tue, 06 May 2008 11:33:25 -0700

Steve White wrote:

Jan,
I agree with your assessment that the need to adjust the memory use per
process is a general one in cluster job submission, and that it is in
some way implemented by any underlying job management system, and that
these extensions ought not to be PBS-specific.

I also looked at your "messy solution".  (The code looks very professional,
really.)  It won't do for my purposes, because I need to present a minimal,
easily understood solution.

Let me explain my situation:

None of the compute resources is under my control.  I can point out
problems to admins, that is all.

I have been assigned two jobs.
I and our users are familiar with doing conventional cluster job submission.
One job was to bring them into the grid fold, showing them the advantages
of globusrun-ws. If it can be shown to be really a cross-platform
solution, giving them the ability to (almost) effortlessly switch
between grid clusters, the effort will be a success.
My other job is to write a report on practical MPI job submission over
the grid.

We have come a long way, but still have to deal with a couple of practical
details. At this point, it looks like both of them will end up as
work-arounds to incomplete implementation of a job submission interface
in Globus.

If with a future release of Globus, these issues can be dealt with, grid
job submission will look very attractive to real researchers.

Hi,

Based on my experience with Globus, you might be following a wrong route
(the route to disenchantment). I view Globus more as a middleware that
has to be adapted (as in: "wrapped around" or "slightly modified")
according to your users' needs and which plays an important role behind
the scenes, but it probably should not be exposed directly to users as a
drop-in replacement for their familiar job submission tools.

There is a reason for that more important than the limitations you have
discovered so far: Globus doesn't ship with command-line job management
commands on par with those of TORQUE/Maui, Condor or SGE. If you let
users submit jobs with globus-job-submit, the next thing they are going
to ask you is "how can I see what jobs I have submitted", "how can I
cancel the job or resubmit it elsewhere", "is my job running or not",
"why is my job not running", "when is my job going to start", etc.

You need something in front of Globus to make your users' life bearable.
Some projects lean toward application-specific web portals (I think
that's AstroGrid's approach). In our project, we have deployed a largely
application-agnostic frontend based on Condor-G, but even so there was
some customization and some user training required. The Condor-G
approach might be relevant for you because it covers the scenario of
making a transparent transition from a local batch system to a Grid -
the Condor tools for submitting jobs and status querying are pretty much
the same regardless of whether your job goes to a machine from a local
pool (equivalent to an SGE or PBS-managed cluster) or to a pool of
Globus hosts. (In fact, Condor can submit to GT2 [gLite], GT4, Unicore,
and some more Grid middlewares.)

The disadvantage of Condor is that it is a rather huge software product
and trying to understand all of it can be daunting. Still, I suppose you
could get the Grid submission piece of it running in a couple of hours
if you wish to give it a try (by following our tutorials and asking
questions where necessary).


Regards,
Jan Ploski
