Edward Ned Harvey wrote:
> By default, it will distribute jobs every 15 sec, but it can be configured
> down to 1 sec. So if you have jobs of 2 sec, the efficiency might not be great.
>
> You can have job dependencies, but I'm not sure how much knowledge it has
> that's relevant to your situation. If you submit a job, let's say jobid is
> 500, and you submit another job, let's say 501, you can make job 501 depend
> on 500. There isn't any conditional detection of "pass/fail" on 500 ... just
> a scheduling delay to ensure 501 runs after 500.

These are some typical differences between 'load sharing' and 'traditional batch' facilities. Some products work at bridging the gap (some more successfully than others), but most products are firmly aimed at one or the other: 'give me jobs and spread them across as many machines as I can get my hands on' (load sharing) or 'run job x based on conditions y on host z' (traditional batch).
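As an aside, in SGE that kind of "runs-after" dependency is expressed with qsub's -hold_jid option. A sketch, assuming SGE is installed; the job script names here are hypothetical:

```shell
# Submit the first job; -terse makes qsub print just the job ID.
JOB1=$(qsub -terse first_step.sh)

# Submit the second job, held until the first one completes.
# Note: -hold_jid is purely a scheduling delay -- it does NOT
# check whether $JOB1 actually succeeded, as described above.
qsub -hold_jid "$JOB1" second_step.sh
```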
The comment above about 'seconds' and 'distribute' is important. A typical load sharing system works hard to spread work across all of the machines that it 'owns'. A typical use is software builds. As for the second comment: builds have few dependencies, and those are usually resolved in a 'make' or similar process.

Batch processes often have a different nature. They are often scheduled to occur at a specific time rather than 'ASAP', and their delivery may be tied to SLAs (Service Level Agreements), where the output of a job must arrive on the C[E|F|I]O's desk at 8 am, under penalty of dismemberment if it's late! Batch jobs often have complex dependency trees with 'corrective' processes that come into play if a particular step or job fails. Batch processing scripts/controls may have a rich language to allow for fine-grained control; load sharers usually do not.

Batch processing evolved from the mainframe era, where Computer Operators (yes, the job title was capitalised :-) put decks of cards into readers, often boxes at a time, and the scheduler (e.g., HASP or JES2 on IBM mainframes) queued up and held all of the jobs. The operators had a worksheet (or workbook) that told them what jobs to release and when, and what to do if a job failed. (This may have involved running a corrective job or paging someone.) Modern batch systems replace the operator with code, to one extent or another. [And yah, when I started in computing this is how it worked. :-]

If you need load sharing, it would seem that systems like SGE and LSF are what you are looking for. If you need batch processing, Autosys, BMC, Orsyp, Tidal, and Tivoli are the places to look for answers. These are the commercial (or near-commercial) solutions; I haven't worked with the Open Source alternatives in this field. That's common for batch processing, since most companies want some company 'on the hook' if their batch processing system (usually inextricably linked to their bread-and-butter) fails.
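The 'corrective process' idea above can be sketched in plain shell. The step and corrective-job names here are hypothetical; a real batch product would express this in its own dependency language, with paging and escalation instead of echo:

```shell
#!/bin/sh
# Sketch of a batch flow with a corrective job: run a step, and if it
# fails, run a cleanup job and retry once before giving up.

# Stand-in for a real batch step. Second argument forces a failure so
# the corrective path is visible.
run_step() {
    step_name=$1
    should_fail=$2
    if [ "$should_fail" = "fail" ]; then
        echo "step $step_name: FAILED"
        return 1
    fi
    echo "step $step_name: OK"
    return 0
}

# Main flow: step_a must succeed before step_b is allowed to run.
if ! run_step step_a fail; then
    # Corrective job: in a real system this might restore a file,
    # rerun an extract, or page an operator before the retry.
    echo "corrective job for step_a: cleaning up"
    run_step step_a ok || exit 1
fi
run_step step_b ok
```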
For load sharing, I'd certainly look over open systems like SGE.

- Richard

_______________________________________________
Tech mailing list
[email protected]
http://lopsa.org/cgi-bin/mailman/listinfo/tech
This list provided by the League of Professional System Administrators
http://lopsa.org/
