Another great option is Cook: https://github.com/twosigma/Cook
Cook combines a simple REST API for batch jobs with sophisticated fair-sharing and preemption features on Mesos. Tomorrow, at MesosCon Europe, I'll be speaking about it in more detail. When we want to use dependencies with Cook, we use a workflow tool that creates the dependent jobs on the fly. (Rough sketches of a Cook job submission, and of the Kafka worker idea mentioned further down the thread, are at the end of this mail.)

On Wed, Oct 7, 2015 at 11:08 AM Pablo Cingolani <pablo.e.cingol...@gmail.com> wrote:
>
> It looks like you are looking for something like BDS:
>
> http://pcingola.github.io/BigDataScript/
>
> It has the additional advantage that you can port your scripts seamlessly
> between Mesos and other cluster systems (SGE, PBS, Torque, etc.).
>
> On Wed, Oct 7, 2015 at 7:05 AM, F21 <f21.gro...@gmail.com> wrote:
>
>> I am also interested in something like this, although my requirements
>> are much simpler.
>>
>> I am interested in a work queue like beanstalkd that will allow me to
>> push to a queue from a web app and have workers do things like send
>> emails, generate PDFs and resize images.
>>
>> I have thought about running beanstalkd in a container, but it has some
>> limitations. For example, if it crashes, it needs to be relaunched
>> manually to recover the binlog (which is a no-go).
>>
>> Another option I can think of is to use Kafka (which has a Mesos
>> framework) and have the web app and other parts push jobs into the Kafka
>> broker. Workers listening on the broker would pop each job off and
>> execute whatever needs to be done.
>>
>> However, there seems to be a lot of wheel-reinventing with that
>> solution. For example, what if a job depends on another job? There is
>> also a lot of work that needs to be done at a lower level, when all I am
>> interested in is writing domain-specific code to generate the PDF,
>> resize the image, etc.
>>
>> If there's a work queue solution for Mesos, I would love to know too.
>>
>> On 7/10/2015 8:08 PM, Brian Candler wrote:
>>
>> On 07/10/2015 09:44, Nikolaos Ballas neXus wrote:
>>
>> Maybe you need to read a bit :)
>>
>> I have read plenty, including those you list, and I didn't find anything
>> which met my requirements. Again, I apologise if I was not clear in my
>> question.
>>
>> Spark has a very specific data model (RDDs) and applications which write
>> to its API. I want to run arbitrary compute jobs - think "shell scripts"
>> or "docker containers" which run pre-existing applications which I can't
>> change. And I want to fill a queue or pipeline with those jobs.
>>
>> Hadoop is also for specific workloads, written to run under Hadoop and
>> preferably using HDFS.
>>
>> The nearest Hadoop gets to general-purpose computing, as far as I can
>> see, is its YARN scheduler. YARN can in turn run under Mesos. Therefore
>> a job queue which can run on YARN might be acceptable, although I'd
>> rather not have an additional layer in the stack. (There was an old
>> project for running Torque under YARN, but it has been abandoned.)
>>
>> Regards,
>>
>> Brian.
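P.S. For the curious, here is a rough sketch in Python of what submitting a single batch job to Cook's REST API looks like. The port (12321) and the exact endpoint and field names are from memory of the Cook README, so treat them as assumptions and double-check against the docs:

    # Minimal sketch: submit one batch job to a Cook scheduler.
    # Assumptions: scheduler on localhost:12321, POST /rawscheduler
    # endpoint, job fields as described in the Cook README.
    import json
    import uuid
    import urllib.request

    job = {
        "uuid": str(uuid.uuid4()),   # client-generated job id
        "command": "echo hello",     # arbitrary shell command to run
        "max_retries": 3,
        "cpus": 1.0,
        "mem": 128.0,                # MiB
    }

    req = urllib.request.Request(
        "http://localhost:12321/rawscheduler",
        data=json.dumps({"jobs": [job]}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        print(resp.status, resp.read().decode())

Because the job is just a shell command with resource requirements, this covers Brian's "shell scripts / docker containers" case without writing to any framework-specific API.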
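And since F21 asked about the Kafka route: a minimal worker loop could look something like the following. This is only a sketch of the idea, not a recommendation - the topic name, broker address, and job schema are invented for illustration, and it uses the third-party kafka-python package (pip install kafka-python):

    # Sketch of F21's Kafka-as-work-queue idea: the web app produces
    # JSON job messages to a topic; each worker in a consumer group
    # pops jobs off and dispatches to domain-specific code.
    import json
    from kafka import KafkaConsumer  # third-party: kafka-python

    def generate_pdf(job):
        print("generating pdf:", job)    # placeholder for domain code

    def resize_image(job):
        print("resizing image:", job)    # placeholder for domain code

    HANDLERS = {"generate_pdf": generate_pdf, "resize_image": resize_image}

    consumer = KafkaConsumer(
        "jobs",                              # hypothetical topic name
        bootstrap_servers="localhost:9092",  # hypothetical broker address
        group_id="workers",                  # one group = one worker pool
        value_deserializer=lambda b: json.loads(b.decode("utf-8")),
    )

    for msg in consumer:      # blocks, yielding one job message at a time
        job = msg.value       # e.g. {"type": "generate_pdf", "doc": "..."}
        HANDLERS[job["type"]](job)

As F21 notes, this still leaves retries, job dependencies, and scheduling to you, which is exactly the wheel-reinventing a framework like Cook is meant to avoid.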