Hi Everyone,

I've just sent a mail to the list introducing BDS, my simple scripting
language that runs on Mesos:

http://pcingola.github.io/BigDataScript/

@Aaron: It looks like BDS should fit what you are looking for. BDS is
designed for complex pipelines with many dependencies (at least that's
what most people are using it for).

@Douglas and @Sharma: BDS follows the same structure: dependencies are
resolved first, and the tasks are then sent to the scheduler.

Please let me know what you think.

Yours,
Pablo

On Thu, May 14, 2015 at 12:45 PM, Sharma Podila <[email protected]> wrote:

> Hi Aaron, yeah, an in-house one.
>
> On Thu, May 14, 2015 at 1:09 AM, Aaron Carey <[email protected]> wrote:
>
>> Hi Sharma,
>>
>> This sounds eerily familiar! Was this an in-house system you were
>> working on, or a commercial product?
>>
>> Thanks,
>> Aaron
>>
>> ------------------------------
>> *From:* Sharma Podila [[email protected]]
>> *Sent:* 13 May 2015 23:49
>> *To:* [email protected]
>> *Cc:* Douglas Thain; Brian Bockelman
>> *Subject:* Re: Batch Scheduler with dependency support
>>
>>> I keep longing for folks with decades of experience in HTC&HPC to
>>> chime in "on-list".
>>
>> FWIW, I come from that background, but am not in that space at this
>> time. My prior life was in developing a (not open source) distributed
>> job scheduler and management system for batch and interactive jobs
>> that handled dependencies, deadlines, preemptions, advance reservation
>> of resources, etc., with multi-level priority and share-tree-hierarchy
>> based allocation.
>>
>> Typically, dependencies and deadlines are handled outside of
>> schedulers and fed into schedulers as task submissions after the
>> dependencies have been met. We found it more effective to have the
>> scheduler resolve dependencies and deadlines natively. This way, a
>> high-priority job that depends on a low-priority job can induce a
>> higher priority on the job it depends on. Similarly, a job with a
>> deadline that depends on another job's completion can induce an
>> earlier launch of the latter job in order to meet its deadline. Also,
>> a dependent job can reserve its resources in advance, knowing the
>> expected completion time of the jobs it depends on. This was important
>> because in that environment we always had more jobs to run than could
>> run on the available resources. It wasn't unusual to have tens of
>> thousands of jobs waiting in the queue to run during the day.
>>
>> Not sure if this helps the original question in this thread in any
>> way. But I am glad to share my learning, if that helps.
>>
>> Sharma
>>
>> On Wed, May 13, 2015 at 1:12 PM, Tim St Clair <[email protected]> wrote:
>>
>>> Hi Alex,
>>>
>>> Have you by chance integrated with any of the traditional batch DAG
>>> systems?
>>>
>>> http://pegasus.isi.edu/ , http://ccl.cse.nd.edu/software/makeflow/
>>>
>>> I keep longing for folks with decades of experience in HTC&HPC to
>>> chime in "on-list".
>>>
>>> Subtle nudge ;-)
>>> Tim
>>>
>>> ------------------------------
>>> *From: *"Alex Gaudio" <[email protected]>
>>> *To: *[email protected]
>>> *Sent: *Wednesday, May 13, 2015 3:04:20 PM
>>> *Subject: *Re: Batch Scheduler with dependency support
>>>
>>> Hi Tim (and everyone else!),
>>>
>>> I am the primary author of Stolos. We use Stolos to run all of our
>>> batch jobs on Mesos. The batch jobs are scripts we can run from the
>>> command line. Scripts range from bash scripts to Spark jobs and R
>>> scripts.
>>>
>>> It's a great tool for us because, unlike Chronos, it lets us define a
>>> script as a stage in a dependency chain, where the script can run
>>> with different parameters for different dependency contexts. (The
>>> closest equivalent would be to run many Chronos servers, though this
>>> does not work in all cases.)
>>>
>>> The tool is a critical component of Sailthru's data science
>>> infrastructure, but I believe we are the only people who use the tool
>>> right now.
>>>
>>> If you are interested in learning more, I'm happy to invest time to
>>> talk more about Stolos, what it does and how we use it!
>>>
>>> Alex
>>>
>>> On Wed, May 13, 2015 at 2:02 PM Tim Chen <[email protected]> wrote:
>>>
>>>> How are you running your batch jobs? Is the batch job
>>>> script/executable an in-house app?
>>>>
>>>> Tim
>>>>
>>>> On Wed, May 13, 2015 at 9:46 AM, Andras Kerekes <[email protected]> wrote:
>>>>
>>>>> You might want to have a look at Stolos too:
>>>>>
>>>>> https://github.com/sailthru/stolos
>>>>>
>>>>> Andras
>>>>>
>>>>> *From:* Aaron Carey [mailto:[email protected]]
>>>>> *Sent:* Wednesday, May 13, 2015 11:54 AM
>>>>> *To:* [email protected]
>>>>> *Subject:* RE: Batch Scheduler with dependency support
>>>>>
>>>>> Thanks! I hadn't come across that one before :)
>>>>> ------------------------------
>>>>> *From:* [email protected] [[email protected]] on behalf of
>>>>> Jeff Schroeder [[email protected]]
>>>>> *Sent:* 13 May 2015 16:39
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Batch Scheduler with dependency support
>>>>>
>>>>> Look up HubSpot's Singularity
>>>>>
>>>>> On Wednesday, May 13, 2015, Aaron Carey <[email protected]> wrote:
>>>>>
>>>>> Thanks Jeff,
>>>>>
>>>>> Any other options around as well?
>>>>> ------------------------------
>>>>> *From:* [email protected] [[email protected]] on behalf of
>>>>> Jeff Schroeder [[email protected]]
>>>>> *Sent:* 13 May 2015 14:12
>>>>> *To:* [email protected]
>>>>> *Subject:* Batch Scheduler with dependency support
>>>>>
>>>>> It does both just as well, along with cron-like functionality. It
>>>>> is harder to install and takes a bit more understanding, however.
>>>>> The official tutorial is a process that loops 100 times and then
>>>>> exits.
>>>>>
>>>>> http://aurora.apache.org/documentation/latest/tutorial/#the-script
>>>>>
>>>>> Aurora is pretty much a superset of most other generic frameworks,
>>>>> sans maybe HubSpot's Singularity.
>>>>>
>>>>> On Wednesday, May 13, 2015, Aaron Carey <[email protected]> wrote:
>>>>>
>>>>> I was under the impression Aurora was for long-running services? Is
>>>>> it suitable for scheduling one-off batch processes too?
>>>>>
>>>>> Thanks,
>>>>> Aaron
>>>>> ------------------------------
>>>>> *From:* [email protected] [[email protected]] on behalf of
>>>>> Jeff Schroeder [[email protected]]
>>>>> *Sent:* 13 May 2015 13:12
>>>>> *To:* [email protected]
>>>>> *Subject:* Re: Batch Scheduler with dependency support
>>>>>
>>>>> Apache Aurora does this and you can be explicit about the ordering
>>>>>
>>>>> On Wednesday, May 13, 2015, Aaron Carey <[email protected]> wrote:
>>>>>
>>>>> Hi All,
>>>>>
>>>>> I was just wondering if anyone out there knew of a good Mesos batch
>>>>> scheduler which supports dependencies between tasks? (i.e. Task B
>>>>> cannot run until Task A is complete)
>>>>>
>>>>> Thanks,
>>>>> Aaron
>>>>>
>>>>> --
>>>>> Text by Jeff, typos by iPhone
>>>>>
>>>
>>> --
>>> Cheers,
>>> Timothy St. Clair
>>> Red Hat Inc.
>>>
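
For anyone skimming the thread: below is a minimal Python sketch of the
scheduler-side idea Sharma describes, where a high-priority job raises the
priority of the jobs it depends on. It is not taken from BDS, Stolos, Aurora
or Sharma's in-house system. The function names, the "larger number = more
urgent" convention, the single execution slot and the sample jobs are all
assumptions made up for illustration.

```python
def effective_priorities(base_priority, depends_on):
    """Raise each prerequisite to the priority of its most urgent dependent.

    base_priority: dict of job name -> priority (larger = more urgent; assumed).
    depends_on: dict of job name -> list of jobs that must finish first.
    Every job named in depends_on is assumed to appear in base_priority.
    """
    effective = dict(base_priority)
    for job, priority in base_priority.items():
        seen = set()
        stack = list(depends_on.get(job, ()))
        # Walk all direct and transitive prerequisites of this job.
        while stack:
            prereq = stack.pop()
            if prereq in seen:
                continue
            seen.add(prereq)
            effective[prereq] = max(effective[prereq], priority)
            stack.extend(depends_on.get(prereq, ()))
    return effective


def dispatch_order(base_priority, depends_on):
    """Launch order for a hypothetical scheduler with one execution slot."""
    effective = effective_priorities(base_priority, depends_on)
    done, order = set(), []
    while len(order) < len(base_priority):
        # A job is ready once everything it depends on has finished.
        ready = [
            job for job in base_priority
            if job not in done
            and all(dep in done for dep in depends_on.get(job, ()))
        ]
        if not ready:
            raise ValueError("dependency cycle detected")
        # Most urgent first; ties broken alphabetically for determinism.
        nxt = min(ready, key=lambda job: (-effective[job], job))
        done.add(nxt)
        order.append(nxt)
    return order


if __name__ == "__main__":
    base = {"report": 10, "extract": 1, "cleanup": 5}
    deps = {"report": ["extract"]}
    print(effective_priorities(base, deps))
    print(dispatch_order(base, deps))
```

With these sample jobs, "extract" inherits the priority of "report" and is
launched ahead of "cleanup", which would have gone first if only the base
priorities were considered. Deadlines and advance reservations, also mentioned
in the thread, are left out of the sketch.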

