>
> I keep longing for folks with decades of experience in HTC&HPC to chime
> in "on-list".


FWIW, I come from that background, but I am not in that space at this time.
My prior life was in developing a (not open source) distributed job
scheduler and management system for batch and interactive jobs that handled
dependencies, deadlines, preemptions, advance reservation of resources,
etc. with multi-level priority and share tree hierarchy based allocation.
Typically, dependencies and deadlines are handled outside of schedulers and
fed into schedulers as task submissions after dependencies have been met. We
found it better to have the scheduler resolve dependencies and deadlines
itself. This way, a high priority job that depends on a low priority job
can induce a higher priority on the job it depends on. Similarly, a job
with a deadline that depends on another job's completion can induce an
earlier launch of that other job in order to meet its deadline. Also, a
dependent job can reserve its resources in advance, knowing the expected
completion time of the jobs it depends on. This was important because in
that environment we always had more jobs to run than could run on the
available resources. It wasn't unusual to have tens of thousands of jobs
waiting in the queue to run during the day.
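
To make the idea above concrete, here is a rough sketch in Python of how a
priority and a deadline could be pushed from a job onto the jobs it depends
on. This is only an illustration, not the scheduler I described: the Job
class, the propagate function, the "larger number means more urgent"
convention and the use of plain hours as time units are all assumptions
made up for the example, and it assumes the dependency graph has no cycles.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Job:
    # Hypothetical job record, invented for this illustration.
    name: str
    priority: int                      # assumed: larger number = more urgent
    runtime: float                     # estimated run time, in hours
    deadline: Optional[float] = None   # time by which the job must finish
    depends_on: List[str] = field(default_factory=list)


def propagate(jobs: Dict[str, Job]) -> None:
    """Push priorities and deadlines from each job onto its prerequisites.

    Assumes the dependency graph is acyclic.
    """

    def visit(job: Job) -> None:
        # Latest time this job may start and still meet its own deadline.
        latest_start = None if job.deadline is None else job.deadline - job.runtime
        for dep_name in job.depends_on:
            dep = jobs[dep_name]
            changed = False
            # A prerequisite inherits the higher priority of its dependent.
            if dep.priority < job.priority:
                dep.priority = job.priority
                changed = True
            # A prerequisite must finish before its dependent has to start.
            if latest_start is not None and (
                dep.deadline is None or dep.deadline > latest_start
            ):
                dep.deadline = latest_start
                changed = True
            # Only walk further down the chain if something was tightened.
            if changed:
                visit(dep)

    for job in list(jobs.values()):
        visit(job)


jobs = {
    "extract": Job("extract", priority=1, runtime=2),
    "transform": Job("transform", priority=1, runtime=1, depends_on=["extract"]),
    "report": Job("report", priority=9, runtime=1, deadline=10,
                  depends_on=["transform"]),
}
propagate(jobs)
for job in jobs.values():
    print(job.name, job.priority, job.deadline)
```

In this toy run the high priority "report" job, due at hour 10, raises both
"transform" and "extract" to priority 9 and gives them induced deadlines of
hour 9 and hour 8 respectively. A real scheduler would feed those induced
values into its queue ordering and advance reservations.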

Not sure if this helps with the original question in this thread in any way,
but I am glad to share what I learned, if that helps.

Sharma


On Wed, May 13, 2015 at 1:12 PM, Tim St Clair <[email protected]> wrote:

> Hi Alex,
>
> Have you by chance integrated with any of the traditional batch DAG systems?
>
> http://pegasus.isi.edu/ , http://ccl.cse.nd.edu/software/makeflow/
>
> I keep longing for folks with decades of experience in HTC&HPC to chime in
> "on-list".
>
> Subtle nudge ;-)
> Tim
>
> ------------------------------
>
> *From: *"Alex Gaudio" <[email protected]>
> *To: *[email protected]
> *Sent: *Wednesday, May 13, 2015 3:04:20 PM
>
> *Subject: *Re: Batch Scheduler with dependency support
>
> Hi Tim (and everyone else!),
>
> I am the primary author of Stolos.  We use Stolos to run all of our batch
> jobs on Mesos.  The batch jobs are scripts we can run from the
> command-line.  Scripts range from bash scripts to Spark jobs and R scripts.
>
> It's a great tool for us because, unlike Chronos, it lets us define a
> script as a stage in a dependency chain, where the script can run with
> different parameters for different dependency contexts.  (The closest
> equivalent would be to run many Chronos servers, though this does not work
> in all cases).
>
> The tool is a critical component of Sailthru's data science
> infrastructure, but I believe we are the only people who use the tool right
> now.
>
> If you are interested in learning more, I'm happy to invest time to talk
> more about Stolos, what it does and how we use it!
>
> Alex
>
> On Wed, May 13, 2015 at 2:02 PM Tim Chen <[email protected]> wrote:
>
>> How are you running your batch jobs? Is the batch job script/executable
>> an in-house app?
>>
>> Tim
>>
>> On Wed, May 13, 2015 at 9:46 AM, Andras Kerekes <
>> [email protected]> wrote:
>>
>>> You might want to have a look at stolos too:
>>>
>>>
>>>
>>> https://github.com/sailthru/stolos
>>>
>>>
>>>
>>> Andras
>>>
>>>
>>>
>>>
>>>
>>> *From:* Aaron Carey [mailto:[email protected]]
>>> *Sent:* Wednesday, May 13, 2015 11:54 AM
>>> *To:* [email protected]
>>> *Subject:* RE: Batch Scheduler with dependency support
>>>
>>>
>>>
>>> Thanks! I hadn't come across that one before :)
>>> ------------------------------
>>>
>>> *From:* [email protected] [[email protected]] on behalf of Jeff
>>> Schroeder [[email protected]]
>>> *Sent:* 13 May 2015 16:39
>>> *To:* [email protected]
>>> *Subject:* Re: Batch Scheduler with dependency support
>>>
>>> Look up HubSpot's Singularity
>>>
>>> On Wednesday, May 13, 2015, Aaron Carey <[email protected]> wrote:
>>>
>>> Thanks Jeff,
>>>
>>> Any other options around as well?
>>> ------------------------------
>>>
>>> *From:* [email protected] [[email protected]] on behalf of Jeff
>>> Schroeder [[email protected]]
>>> *Sent:* 13 May 2015 14:12
>>> *To:* [email protected]
>>> *Subject:* Batch Scheduler with dependency support
>>>
>>> It does both just as well, along with cron-like functionality. It is
>>> harder to install and takes a bit more understanding, however. The
>>> official tutorial is a process that loops 100 times and then exits.
>>>
>>>
>>>
>>> http://aurora.apache.org/documentation/latest/tutorial/#the-script
>>>
>>> Aurora is pretty much a superset of most other generic frameworks sans
>>> maybe hubspot's singularity.
>>>
>>>
>>> On Wednesday, May 13, 2015, Aaron Carey <[email protected]> wrote:
>>>
>>> I was under the impression Aurora was for long-running services? Is it
>>> suitable for scheduling one-off batch processes too?
>>>
>>> thanks,
>>> Aaron
>>> ------------------------------
>>>
>>> *From:* [email protected] [[email protected]] on behalf of Jeff
>>> Schroeder [[email protected]]
>>> *Sent:* 13 May 2015 13:12
>>> *To:* [email protected]
>>> *Subject:* Re: Batch Scheduler with dependency support
>>>
>>> Apache Aurora does this and you can be explicit about the ordering
>>>
>>> On Wednesday, May 13, 2015, Aaron Carey <[email protected]> wrote:
>>>
>>> Hi All,
>>>
>>> I was just wondering if anyone out there knew of a good Mesos batch
>>> scheduler which supports dependencies between tasks? (i.e. Task B cannot
>>> run until Task A is complete)
>>>
>>> Thanks,
>>> Aaron
>>>
>>>
>>>
>>> --
>>> Text by Jeff, typos by iPhone
>>>
>>>
>>
>>
>
>
> --
> Cheers,
> Timothy St. Clair
> Red Hat Inc.
>
