On Thu, Jan 10, 2008 at 02:59:27PM +0100, Jos Houtman wrote:
> For my master thesis I took up a project that requires mapping of a number of 
> statically defined parallel jobs into a more dynamic environment that allows 
> better scaling.  
> The situation as described below led me to believe a cluster or distributed 
> queue (DrQueue?) solution is necessary. For the situation see [situation] at 
> the end of this email.
Off the top of my head, many of your requirements are covered by two
very different applications:
- Gearman, written by Brad Fitzpatrick at LiveJournal. It is mainly Perl,
  though I think it has interfaces for other languages as well.
- Torque/PBS, which is somewhat less of a fit; I'm not certain it
  handles perpetual jobs.

You may also need some degree of STONITH to guarantee that a job runs
only once when a node fails. (Say the job manager crashes: the job is
still running, but you have no control over it, so you need to zap it
hard.)
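
To make the STONITH idea concrete, here is a rough Python sketch of a
heartbeat-based fence. It is only an illustration under assumptions of
my own: the FencedJob class, its timeout, and the kill callback are
hypothetical and are not part of Gearman or Torque/PBS. The point it
shows is the ordering: forcibly stop the unmanaged job first, and only
then treat it as safe to reschedule.

```python
import time


class FencedJob:
    """Tracks one job's heartbeat. Hypothetical helper, not a Gearman/Torque API."""

    def __init__(self, job_id, timeout, kill, clock=time.monotonic):
        self.job_id = job_id
        self.timeout = timeout  # seconds of silence tolerated before fencing
        self.kill = kill        # assumed callable that forcibly stops the job
        self.clock = clock      # injectable clock, so the logic can be tested
        self.last_beat = clock()
        self.fenced = False

    def heartbeat(self):
        """Called by the job manager while it still controls the job."""
        if self.fenced:
            # A manager that comes back late must not resume a fenced job.
            raise RuntimeError("job %s was fenced" % self.job_id)
        self.last_beat = self.clock()

    def check(self):
        """Fence the job if its manager went silent.

        Returns True once the job has been zapped and may be rescheduled.
        """
        if not self.fenced and self.clock() - self.last_beat > self.timeout:
            self.kill(self.job_id)  # zap first ...
            self.fenced = True      # ... only then allow a restart elsewhere
        return self.fenced
```

In a real setup the kill callback would be whatever hard stop you trust
(SIGKILL on the process, or powering off the node), and check() would be
driven by a monitor running on a different machine than the job manager.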

-- 
Robin Hugh Johnson
Gentoo Linux Developer & Infra Guy
E-Mail     : [EMAIL PROTECTED]
GnuPG FP   : 11AC BA4F 4778 E3F6 E4ED  F38E B27B 944E 3488 4E85
