Hi,
Alan Sill wrote:
On May 27, 2008, at 12:20 AM, Yuriy wrote:
We have a 10-node cluster with 2
quad-core processors per node, and when the number of jobs is greater than
160
...
(2) Alternatively, use a pilot-job or glide-in job submission scenario
to send a single job to the remote grid resource and interact with it
locally to handle your local submissions.
Great idea!
...
P.S.: I have encouraged the Globus team privately to consider
integrating pilot-job or glide-in capabilities more closely into the
core software to provide the user with easier hooks for this, and to
minimize the need for users to reinvent this type of workflow
control. There could be other ideas out there to handle this also.
Anyone want to chime in?
The Falkon project is probably what you are looking for: it is a Globus
Incubator project that provides the glide-in capabilities you are
referring to, all over web services. It also takes a different
approach to queuing, in which a queued task (aka job) consumes only memory
and does not consume any processor or network resources by
polling or waiting for processors to become available. This essentially
allows Falkon to hold queues of millions of tasks without affecting
its performance. This doesn't mean you can ignore the need for
throttling, but the throttling can occur at a much higher threshold
(100K or 1M concurrent tasks) instead of the hundreds to thousands that
are typical when running directly with GRAM/PBS/Condor/SGE. For more
details on Falkon, see the web site at
http://dev.globus.org/wiki/Incubator/Falkon. There are also
Falkon-specific mailing lists where you might want to ask questions
(http://dev.globus.org/wiki/Incubator/Falkon#Mailing_Lists).
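To make the queuing idea above a bit more concrete, here is a minimal
Python sketch. It is not Falkon's API or implementation; the function
name run_tasks and the parameters num_workers and max_queued are
hypothetical, chosen only for illustration. Queued tasks sit in an
in-memory queue, idle workers block instead of polling, and the
throttle is simply the queue's maximum size.

```python
import queue
import threading


def run_tasks(tasks, num_workers=4, max_queued=1_000_000):
    """Run callables through a passive in-memory queue (illustrative only)."""
    # A queued task costs only memory. maxsize is the throttle:
    # put() blocks once max_queued tasks are waiting.
    pending = queue.Queue(maxsize=max_queued)
    results = []
    results_lock = threading.Lock()

    def worker():
        while True:
            task = pending.get()  # blocks until work arrives, no busy-polling
            if task is None:      # sentinel: no more work for this worker
                break
            value = task()
            with results_lock:
                results.append(value)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for task in tasks:
        pending.put(task)         # the submitter is throttled here when full
    for _ in threads:
        pending.put(None)         # one sentinel per worker
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    squares = run_tasks([lambda i=i: i * i for i in range(1000)])
    print(len(squares))
```

With a real dispatcher the workers would be remote executors rather
than threads, but the point is the same: the submitter only slows down
when the queue limit is reached, not when processors are busy.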
Cheers,
Ioan
Hope this helps
Alan Sill, Ph.D
TIGRE Senior Scientist, High Performance Computing Center
Adjunct Professor of Physics
TTU
====================================================================
: Alan Sill, Texas Tech University Office: Admin 233, MS 4-1167 :
: e-mail: [EMAIL PROTECTED] ph. 806-742-4350 fax 806-742-4358 :
====================================================================
--
===================================================
Ioan Raicu
Ph.D. Candidate
===================================================
Distributed Systems Laboratory
Computer Science Department
University of Chicago
1100 E. 58th Street, Ryerson Hall
Chicago, IL 60637
===================================================
Email: [EMAIL PROTECTED]
Web: http://www.cs.uchicago.edu/~iraicu
http://dev.globus.org/wiki/Incubator/Falkon
http://dsl-wiki.cs.uchicago.edu/index.php/Main_Page
===================================================