Another great option is Cook: https://github.com/twosigma/Cook
Cook combines a simple REST API for batch jobs with sophisticated fair-sharing and preemption features on Mesos. Tomorrow, at MesosCon Europe, I'll be speaking about it in more detail. When we want to use dependencies with Cook, we use a workflow tool that creates the dependent jobs on the fly. (Rough sketches of a Cook job submission, and of the Kafka worker idea mentioned further down the thread, are at the end of this mail.)

On Wed, Oct 7, 2015 at 11:08 AM Pablo Cingolani <pablo.e.cingol...@gmail.com> wrote:
>
> It looks like you are looking for something like BDS:
>
> http://pcingola.github.io/BigDataScript/
>
> It has the additional advantage that you can port your scripts seamlessly
> between Mesos and other cluster systems (SGE, PBS, Torque, etc.).
>
> On Wed, Oct 7, 2015 at 7:05 AM, F21 <f21.gro...@gmail.com> wrote:
>
>> I am also interested in something like this, although my requirements
>> are much simpler.
>>
>> I am interested in a work queue like beanstalkd that will allow me to
>> push to a queue from a web app and have workers do things like send
>> emails, generate PDFs and resize images.
>>
>> I have thought about running beanstalkd in a container, but it has some
>> limitations. For example, if it crashes, it needs to be relaunched
>> manually to recover the binlog (which is a no-go).
>>
>> Another option I can think of is to use Kafka (which has a Mesos
>> framework) and have the web app and other parts push jobs into the Kafka
>> broker. Workers listening on the broker would pop each job off and
>> execute whatever needs to be done.
>>
>> However, there seems to be a lot of wheel-reinventing with that
>> solution. For example, what if a job depends on another job? There is
>> also a lot of work that needs to be done at a lower level, when all I am
>> interested in is writing domain-specific code to generate the PDF,
>> resize the image, etc.
>>
>> If there's a work queue solution for Mesos, I would love to know too.
>>
>> On 7/10/2015 8:08 PM, Brian Candler wrote:
>>
>> On 07/10/2015 09:44, Nikolaos Ballas neXus wrote:
>>
>> Maybe you need to read a bit :)
>>
>> I have read plenty, including those you list, and I didn't find anything
>> which met my requirements. Again, I apologise if I was not clear in my
>> question.
>>
>> Spark has a very specific data model (RDDs) and applications which write
>> to its API. I want to run arbitrary compute jobs - think "shell scripts"
>> or "docker containers" which run pre-existing applications which I can't
>> change. And I want to fill a queue or pipeline with those jobs.
>>
>> Hadoop is also for specific workloads, written to run under Hadoop and
>> preferably using HDFS.
>>
>> The nearest Hadoop gets to general-purpose computing, as far as I can
>> see, is its YARN scheduler. YARN can in turn run under Mesos. Therefore
>> a job queue which can run on YARN might be acceptable, although I'd
>> rather not have an additional layer in the stack. (There was an old
>> project for running Torque under YARN, but it has been abandoned.)
>>
>> Regards,
>>
>> Brian.
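P.S. For the curious, here is a rough sketch in Python of what submitting a single batch job to Cook's REST API looks like. The port (12321) and the exact endpoint and field names are from memory of the Cook README, so treat them as assumptions and double-check against the docs:

    # Minimal sketch: submit one batch job to a Cook scheduler.
    # Assumptions: scheduler on localhost:12321, POST /rawscheduler
    # endpoint, job fields as described in the Cook README.
    import json
    import uuid
    import urllib.request

    job = {
        "uuid": str(uuid.uuid4()),   # client-generated job id
        "command": "echo hello",     # arbitrary shell command to run
        "max_retries": 3,
        "cpus": 1.0,
        "mem": 128.0,                # MiB
    }

    req = urllib.request.Request(
        "http://localhost:12321/rawscheduler",
        data=json.dumps({"jobs": [job]}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.status, resp.read().decode())

Because the job is just a shell command with resource requirements, this covers Brian's "shell scripts / docker containers" case without writing to any framework-specific API.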
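And since F21 asked about the Kafka route: a minimal worker loop could look something like the following. This is only a sketch of the idea, not a recommendation - the topic name, broker address, and job schema are invented for illustration, and it uses the third-party kafka-python package (pip install kafka-python):

    # Sketch of F21's Kafka-as-work-queue idea: the web app produces
    # JSON job messages to a topic; each worker in a consumer group
    # pops jobs off and dispatches to domain-specific code.
    import json
    from kafka import KafkaConsumer  # third-party: kafka-python

    def generate_pdf(job):
        print("generating pdf:", job)    # placeholder for domain code

    def resize_image(job):
        print("resizing image:", job)    # placeholder for domain code

    HANDLERS = {"generate_pdf": generate_pdf, "resize_image": resize_image}

    consumer = KafkaConsumer(
        "jobs",                              # hypothetical topic name
        bootstrap_servers="localhost:9092",  # hypothetical broker address
        group_id="workers",                  # one group = one worker pool
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    for msg in consumer:      # blocks, yielding one job message at a time
        job = msg.value       # e.g. {"type": "generate_pdf", "doc": "..."}
        HANDLERS[job["type"]](job)

As F21 notes, this still leaves retries, job dependencies, and scheduling to you, which is exactly the wheel-reinventing a framework like Cook is meant to avoid.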