It sounds like you are looking for something like BDS:

  http://pcingola.github.io/BigDataScript/

It has the additional advantage that you can port your scripts seamlessly
between Mesos and other cluster systems (SGE, PBS, Torque, etc.).

On Wed, Oct 7, 2015 at 7:05 AM, F21 <f21.gro...@gmail.com> wrote:

> I am also interested in something like this, although my requirements are
> much simpler.
>
> I am interested in a work queue like beanstalkd that will allow me to push
> to a queue from a web app and have workers do things like send emails,
> generate PDFs and resize images.
>
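> Roughly what I have in mind, as a sketch using the third-party greenstalk
> client for beanstalkd (the client library, tube name and job payload are
> my own illustration, not anything this thread prescribes):
>
>     import greenstalk
>
>     # Web app side: connect to beanstalkd and enqueue a job onto a
>     # dedicated tube. The payload is just a string; JSON works well.
>     producer = greenstalk.Client(('127.0.0.1', 11300))
>     producer.use('jobs')
>     producer.put('{"type": "send_email", "to": "user@example.com"}')
>
>     # Worker side: watch the same tube, reserve a job, do the work,
>     # then delete it so it is not redelivered.
>     worker = greenstalk.Client(('127.0.0.1', 11300))
>     worker.watch('jobs')
>     job = worker.reserve()
>     print('processing', job.body)  # real code would dispatch: email, resize, ...
>     worker.delete(job)
>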
> I have thought about running beanstalkd in a container, but it has some
> limitations. For example, if it crashes, it needs to be relaunched manually
> to recover the binlog, which is a no-go.
>
> Another option I can think of is to use Kafka (which has a Mesos
> framework) and have the web app and other parts push jobs into the Kafka
> broker. Workers listening on the broker would pop each job off and execute
> whatever needs to be done.
>
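> As a sketch of that layout, using the kafka-python client (the topic and
> group names are made up for illustration):
>
>     import json
>     from kafka import KafkaProducer, KafkaConsumer
>
>     # Web app side: publish a job description to a "jobs" topic.
>     producer = KafkaProducer(
>         bootstrap_servers='localhost:9092',
>         value_serializer=lambda v: json.dumps(v).encode('utf-8'))
>     producer.send('jobs', {'type': 'generate_pdf', 'doc_id': 42})
>     producer.flush()
>
>     # Worker side: all workers share one consumer group, so Kafka
>     # delivers each job to exactly one of them.
>     consumer = KafkaConsumer(
>         'jobs',
>         bootstrap_servers='localhost:9092',
>         group_id='workers',
>         value_deserializer=lambda b: json.loads(b.decode('utf-8')))
>     for message in consumer:
>         print('processing', message.value)  # real code would dispatch here
>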
> However, there seems to be a lot of wheel-reinventing with that solution.
> For example, what if a job depends on another job? There's also a lot of
> work that needs to be done at a lower level, when all I am interested in is
> writing domain-specific code to generate the PDF, resize the image, etc.
>
> If there's a work queue solution for Mesos, I would love to know too.
>
> On 7/10/2015 8:08 PM, Brian Candler wrote:
>
> On 07/10/2015 09:44, Nikolaos Ballas neXus wrote:
>
> Maybe you need to read a bit :)
>
> I have read plenty, including those you list, and I didn't find anything
> which met my requirements. Again I apologise if I was not clear in my
> question.
>
> Spark has a very specific data model (RDDs) and expects applications
> written to its API. I want to run arbitrary compute jobs - think "shell
> scripts" or "docker containers" which run pre-existing applications that I
> can't change. And I want to fill a queue or pipeline with those jobs.
>
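> To make the requirement concrete, here is a toy sketch using only the
> Python standard library (the commands are placeholders; the hard parts -
> persistence, distribution, scheduling - are what I want Mesos for):
>
>     import queue
>     import subprocess
>
>     # Fill a queue with arbitrary pre-existing commands: shell
>     # scripts, docker invocations, anything runnable as-is.
>     jobs = queue.Queue()
>     jobs.put(['sh', 'analyse.sh', 'sample1'])
>     jobs.put(['docker', 'run', '--rm', 'myimage', 'step2'])
>
>     # A worker just drains the queue and runs each command unchanged.
>     while not jobs.empty():
>         subprocess.run(jobs.get(), check=True)
>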
> Hadoop is also for specific workloads, written to run under Hadoop and
> preferably using HDFS.
>
> The nearest Hadoop gets to general-purpose computing, as far as I can see,
> is its YARN scheduler. YARN can in turn run under Mesos. Therefore a job
> queue which can run on YARN might be acceptable, although I'd rather not
> have an additional layer in the stack. (There was an old project for
> running Torque under YARN, but it has since been abandoned.)
>
> Regards,
>
> Brian.
>