It sounds like you are looking for something like BDS (BigDataScript): http://pcingola.github.io/BigDataScript/
It has the additional advantage that you can port your scripts seamlessly between Mesos and other cluster systems (SGE, PBS, Torque, etc.).

On Wed, Oct 7, 2015 at 7:05 AM, F21 <f21.gro...@gmail.com> wrote:
> I am also interested in something like this, although my requirements are
> much simpler.
>
> I am interested in a work queue like beanstalkd that will allow me to push
> to a queue from a web app and have workers do things like send emails,
> generate PDFs and resize images.
>
> I have thought about running beanstalkd in a container, but it has some
> limitations. For example, if it crashes, it needs to be relaunched manually
> to recover the binlog (which is a no go).
>
> Another option I can think of is to use Kafka (which has a Mesos
> framework) and have the web app and other parts push jobs into the Kafka
> broker. Workers listening on the broker would pop each job off and execute
> whatever needs to be done.
>
> However, there seems to be a lot of wheel-reinventing with that solution.
> For example, what if a job depends on another job? There's also a lot of
> work that needs to be done at a lower level, when all I am interested in is
> writing domain-specific code to generate the PDF, resize the image, etc.
>
> If there's a work queue solution for Mesos, I would love to know too.
>
> On 7/10/2015 8:08 PM, Brian Candler wrote:
> > On 07/10/2015 09:44, Nikolaos Ballas neXus wrote:
> > > Maybe you need to read a bit :)
> >
> > I have read plenty, including those you list, and I didn't find anything
> > which met my requirements. Again, I apologise if I was not clear in my
> > question.
> >
> > Spark has a very specific data model (RDDs) and applications written to
> > its API. I want to run arbitrary compute jobs - think "shell scripts" or
> > "docker containers" which run pre-existing applications which I can't
> > change. And I want to fill a queue or pipeline with those jobs.
> >
> > Hadoop is also for specific workloads, written to run under Hadoop and
> > preferably using HDFS.
> >
> > The nearest Hadoop gets to general-purpose computing, as far as I can see,
> > is its YARN scheduler. YARN can in turn run under Mesos. Therefore a job
> > queue which can run on YARN might be acceptable, although I'd rather not
> > have an additional layer in the stack. (There was an old project for
> > running Torque under YARN, but this has been abandoned.)
> >
> > Regards,
> >
> > Brian.
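For what it's worth, the push/worker pattern F21 describes (a web app pushing jobs, workers popping them off, and one job able to depend on another) can be sketched independently of beanstalkd or Kafka. The sketch below is a minimal, single-process illustration in Python, not a Mesos framework: a worker defers any job whose dependency hasn't finished yet. The job names and helper functions here are invented for the example.

```python
import queue
import threading

# A job is a (name, func, depends_on) triple. A job whose dependency has
# not finished yet is put back on the queue and retried later.
jobs = queue.Queue()
done = set()
done_lock = threading.Lock()
results = []

def push(name, func, depends_on=None):
    """Enqueue a job, optionally naming another job it depends on."""
    jobs.put((name, func, depends_on))

def worker():
    while True:
        name, func, depends_on = jobs.get()
        with done_lock:
            ready = depends_on is None or depends_on in done
        if not ready:
            jobs.put((name, func, depends_on))  # dependency not met: defer
            jobs.task_done()
            continue
        func()
        with done_lock:
            done.add(name)
        jobs.task_done()

# e.g. generate a PDF, then email it - the email job depends on the PDF job,
# and is deliberately pushed first to show the deferral working.
push("email_pdf", lambda: results.append("emailed"), depends_on="make_pdf")
push("make_pdf", lambda: results.append("pdf made"))

t = threading.Thread(target=worker, daemon=True)
t.start()
jobs.join()  # block until every queued job has run
print(results)  # -> ['pdf made', 'emailed']
```

A real deployment would replace the in-process `queue.Queue` with a durable broker and persist the `done` set, which is exactly the state-recovery problem (beanstalkd's binlog) raised above.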