> So, you presumably have experience running the Sun Grid Engine[1]; how does it
> stack up in the scenario that I outlined?

Yes, I use SGE (Sun Grid Engine) and LSF. SGE is free, LSF is not. They have subtle differences, but nothing enormous. The main reason to use LSF is tool integration. For example, Cadence partnered with LSF to build settings into the GUI menus of its design tools, so the queueing system integrates seamlessly. If a user draws some circuits and normally just clicks the "Run" button (or whatever), the user can still do that with LSF, and they don't need to know where the job actually ran; they get parallelization and serialization, license management, etc.

For the purposes that you outlined:

Stability - Not a problem, in that the daemons never crash or have any problems. The documentation is sometimes inaccurate or confusing, though. Also, if you intend to use the GUI, it's the most confusing, unintuitive thing you've ever seen.

How easy to manage over time - Not awesome. I usually run SGE for engineers, and they are very resistant to change. So I figure out how to get it right once, and simply don't upgrade anything until it's required for their tools to work, and then (once every 2-3 years) we rebuild all the servers and the queueing system. I don't think there is any concern about upgrading the underlying system. It should be noted, though, that once a new SGE is released, you can't download the old SGE anymore. So on the day that you download, you should get every build, even if you think you won't need it. Don't skip the Solaris build just because you don't have Solaris today. Etc.

By default, it dispatches jobs every 15 sec, but that can be configured down to 1 sec. So if you have jobs of 2 sec, the efficiency might not be great.

You can have job dependencies, but I'm not sure how much knowledge it has that's relevant to your situation. If you submit a job, let's say jobid is 500, and you submit another job, let's say 501, you can make job 501 depend on 500. There isn't any conditional detection of "pass/fail" on 500 ...
just a scheduling delay to ensure 501 runs after 500.

> > I wonder what advantage there is to using something other than SGE?
>
> My question was based on ignorance of the tool, actually, so I have no
> particular opinion on that question.
>
> Regards,
> Daniel
>
> Footnotes:
> [1] ...assuming that is the SGE you mean here.
>
> --
> ✣ Daniel Pittman ✉ [email protected] ☎ +61 401 155 707
> ♽ made with 100 percent post-consumer electrons
>
> _______________________________________________
> Tech mailing list
> [email protected]
> http://lopsa.org/cgi-bin/mailman/listinfo/tech
> This list provided by the League of Professional System Administrators
> http://lopsa.org/
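
P.S. To make the dependency point concrete, here is a minimal sketch of how the second submission could be built. It assumes the stock SGE qsub with its -hold_jid option; the helper name hold_until_done and the script name step2.sh are made up for illustration. It only builds the command line and does not run anything.

```python
import shlex


def hold_until_done(script, wait_for_job_id):
    """Build a qsub command that keeps `script` on hold until the given job ends.

    Hypothetical helper for illustration. -hold_jid only delays scheduling;
    it does not look at whether the earlier job passed or failed.
    """
    return ["qsub", "-hold_jid", str(wait_for_job_id), script]


# Job 500 is already in the queue; this would become job 501.
cmd = hold_until_done("step2.sh", 500)
print(shlex.join(cmd))  # qsub -hold_jid 500 step2.sh
```

If you need real pass/fail handling, it would have to live in your own job scripts, since the queue itself only orders the two jobs.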

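
P.P.S. On the efficiency of short jobs, a rough back-of-the-envelope sketch. The model is my own simplification, not how SGE accounts for anything: a slot only receives new work on a scheduler pass, passes happen at a fixed interval, and dispatch overhead is zero. The function name slot_utilization is made up. As far as I recall, the knob itself is schedule_interval in the scheduler configuration, but check your own install.

```python
import math


def slot_utilization(job_seconds, schedule_interval_seconds):
    """Fraction of time one slot is busy under the simplified model.

    Assumption: after a job ends, the slot sits idle until the next
    scheduler pass, and passes occur every schedule_interval_seconds.
    """
    passes = math.ceil(job_seconds / schedule_interval_seconds)
    return job_seconds / (passes * schedule_interval_seconds)


for interval in (15, 1):
    print(interval, round(slot_utilization(2, interval), 2))
# 15 0.13
# 1 1.0
```

So under these assumptions a 2 sec job on the default 15 sec interval keeps a slot busy about 13% of the time, which is why I'd turn the interval down or batch several short tasks into one job.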