> So, you presumably have experience running the Sun Grid Engine[1]; how
> does it stack up in the scenario that I outlined?

Yes, I use SGE (Sun Grid Engine) and LSF.  SGE is free, LSF is not.  The 
differences between them are subtle, not enormous; the main reason to use LSF 
is tool integration.  For example, Cadence partnered with LSF to add settings 
to the GUI menus of its design tools, so the queueing system is integrated 
seamlessly.  If a user draws some circuits and normally just clicks the "Run" 
button (or whatever), they can still do that with LSF, and they don't need to 
know where the job actually ran; they get parallelization and serialization, 
license management, etc.

For the purposes that you outlined:

Stability will not be a problem: in my experience the daemons never crash or 
misbehave.  The documentation, though, is sometimes inaccurate or confusing.  
Also, if you intend to use the GUI, it's the most confusing, unintuitive thing 
you've ever seen.

Ease of management over time: not awesome.  I usually run SGE for engineers, 
and they are very resistant to change.  So I figure out how to get it right 
once, and simply don't upgrade anything until their tools require it; then 
(once every 2-3 years) we rebuild all the servers and the queueing system.

I don't think upgrading the underlying system is a concern.  Note, though, 
that once a new SGE is released, you can't download the old version anymore.  
So on the day that you download, get every build, even the ones you think you 
won't need.  Don't skip the Solaris build just because you don't have Solaris 
today, and so on.

By default, the scheduler dispatches jobs every 15 seconds, but that interval 
can be configured down to 1 second.  So if your jobs run for only 2 seconds, 
the efficiency might not be great, because the wait for dispatch is comparable 
to the run time.
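
To put rough numbers on that, here is a back-of-the-envelope sketch in Python.  
The model is my own assumption, not something from the SGE documentation: after 
each job finishes, a slot sits idle for up to one full scheduling interval 
before the next job starts.  The function name is made up for illustration.

```python
def slot_efficiency(job_seconds, interval_seconds):
    """Worst-case fraction of time a slot spends running jobs.

    Assumed model: the slot idles for one full scheduling interval
    between consecutive jobs.
    """
    return job_seconds / (job_seconds + interval_seconds)


# 2-second jobs at the default 15-second interval and at the 1-second minimum.
for interval in (15, 1):
    busy = slot_efficiency(2, interval)
    print(f"interval {interval:>2}s, 2s jobs: {busy:.0%} busy")
```

Under that worst-case assumption, a slot running 2-second jobs is busy about 
12% of the time at the default interval and about 67% at 1 second.  Real 
numbers depend on when jobs finish relative to the scheduler run.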

You can have job dependencies, but I'm not sure they are smart enough for 
your situation.  If you submit a job, let's say its job ID is 500, and then 
submit another, let's say 501, you can make 501 depend on 500.  There isn't 
any conditional detection of pass/fail on 500, though; it's just a scheduling 
delay to ensure 501 runs after 500.
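
As a minimal sketch of the 500/501 example, this Python helper only builds the 
submit command line; it does not run anything.  It uses qsub's -hold_jid 
option to express the dependency.  The helper name and the script name 
job501.sh are hypothetical.

```python
import shlex


def qsub_after(script, hold_job_id):
    """Build a qsub command that holds `script` until `hold_job_id` is done.

    -hold_jid only delays scheduling; it does not look at whether the
    earlier job passed or failed.
    """
    return ["qsub", "-hold_jid", str(hold_job_id), script]


# job501.sh is a hypothetical script; 500 is the example job ID from above.
cmd = qsub_after("job501.sh", 500)
print(shlex.join(cmd))  # qsub -hold_jid 500 job501.sh
```

If you need real pass/fail handling, the second job would have to check the 
first job's result itself, for example via a status file that the first job 
writes.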




> 
> > I wonder what advantage there is to using something other than SGE?
> 
> My question was based on ignorance of the tool, actually, so I have no
> particular opinion on that question.
> 
> Regards,
>         Daniel
> 
> 
> Footnotes:
> [1]  ...assuming that is the SGE you mean here.
> 
> --
> ✣ Daniel Pittman            ✉ [email protected]            ☎ +61 401 155 707
>                ♽ made with 100 percent post-consumer electrons
> 
> _______________________________________________
> Tech mailing list
> [email protected]
> http://lopsa.org/cgi-bin/mailman/listinfo/tech
> This list provided by the League of Professional System Administrators
>  http://lopsa.org/

