On Thu, 16 Jul 2009 12:16:14 -0400 Doug Hughes wrote:

Doug> Narayan Desai wrote:
Doug> > On Thu, 16 Jul 2009 11:15:48 -0400 Edward Ned Harvey wrote:
Doug> >
Doug> > Ned> > I am interested in soliciting experiences deploying, using and
Doug> > Ned> > maintaining the Condor batch processing system, especially
Doug> > Ned> > under Linux / Debian.
Doug> > Ned> >
Doug> > Ned> > Our use would predominantly be many small jobs, rather than a
Doug> > Ned> > few large jobs, with runtimes measured in a few hours. Probably
Doug> > Ned> > only a handful of nodes, on the order of half a dozen, in
Doug> > Ned> > total.[1]
Doug> >
Doug> > Ned> I don't know anything about condor, or torque. The obvious
Doug> > Ned> choice to me would be SGE. I wonder what advantage there is to
Doug> > Ned> using something other than SGE?
Doug> >
Doug> > Well, the area where condor is pretty much the undisputed king is in the
Doug> > scavenger arena. The basic idea is that you could deploy condor on top
Doug> > of your regular desktops and jobs would be deployed to use wasted
Doug> > cycles (during idle periods or on a set schedule, etc).
Doug> > -nld
Doug> >
Doug> Doesn't it also excel at the whole state/migration thing? E.G. you can
Doug> take a node out for maintenance and migrate a running job off to
Doug> another node by saving the memory state and performing the migration
Doug> and then resuming the job. (May only work for some job configurations)

So I hear. I don't have any direct experience with the
checkpointing/migration stuff. I gather they are starting to use VMs for
this sort of thing as well as library-based checkpointing.
-nld
_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
