I think any pub/sub system (a typical JMS broker, RabbitMQ, Kafka, etc.) would do 
what you describe. All of them can be run as containers inside an Apache Mesos 
cluster. Kafka has really good integration with Mesos and YARN, and is also more 
lightweight than a typical JMS implementation.

regards
On 07 Oct 2015, at 12:05, F21 
<f21.gro...@gmail.com> wrote:

I am also interested in something like this, although my requirements are much 
simpler.

I am interested in a work queue like beanstalkd that will allow me to push to a 
queue from a web app and have workers do things like send emails, generate 
PDFs and resize images.
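For what it's worth, the core pattern is small enough to sketch with Python's standard library alone. This is an in-process stand-in, not a substitute for a durable broker like beanstalkd or Kafka, and the job names are made up:

```python
import queue
import threading

# In-process sketch of the work-queue pattern: the "web app" pushes jobs,
# a worker thread pops them off and dispatches on the job type.
jobs = queue.Queue()
results = []

def handle(job):
    # Placeholder for domain-specific work (send email, render PDF, ...).
    results.append("done:" + job["type"])

def worker():
    while True:
        job = jobs.get()
        if job is None:          # sentinel value: shut the worker down
            jobs.task_done()
            break
        handle(job)
        jobs.task_done()

t = threading.Thread(target=worker)
t.start()

# The web-app side just enqueues jobs and moves on.
jobs.put({"type": "send_email"})
jobs.put({"type": "resize_image"})
jobs.put(None)

jobs.join()
t.join()
```

A real broker adds the parts this sketch lacks: durability across crashes, delivery to workers on other machines, and retries.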

I have thought about running beanstalkd in a container, but it has some 
limitations. For example, if it crashes, it needs to be relaunched manually to 
recover the binlog (which is a no-go).

Another option I can think of is to use kafka (which has a mesos framework) and 
have the web app and other parts push jobs into the kafka broker. Workers 
listening on the broker would pop each job off and execute whatever needs to be 
done.
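That Kafka approach might look something like the sketch below. The topic name and job fields are assumptions on my part, and the kafka-python calls are left commented out since they need a live broker; only the job envelope itself is shown working:

```python
import json

# A job "envelope": the web app serializes a job description, workers
# deserialize it and dispatch on the type. Field names are made up.
def encode_job(job_type, payload):
    return json.dumps({"type": job_type, "payload": payload}).encode("utf-8")

def decode_job(raw):
    return json.loads(raw.decode("utf-8"))

# Producer side (web app), assuming the kafka-python client:
# from kafka import KafkaProducer
# producer = KafkaProducer(bootstrap_servers="localhost:9092")
# producer.send("jobs", encode_job("resize_image", {"path": "/tmp/a.png"}))

# Consumer side (worker):
# from kafka import KafkaConsumer
# for msg in KafkaConsumer("jobs", bootstrap_servers="localhost:9092"):
#     job = decode_job(msg.value)
#     ...dispatch on job["type"]...

# Round-trip the envelope locally:
job = decode_job(encode_job("resize_image", {"path": "/tmp/a.png"}))
```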

However, there seems to be a lot of wheel-reinventing with that solution. For 
example, what if a job depends on another job? There's also a lot of work that 
needs to be done at a lower level, when all I am interested in is writing 
domain-specific code to generate the PDF, resize the image, etc.
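The dependency case is exactly the kind of thing you end up building yourself: before enqueuing, you have to order jobs so that prerequisites run first. A minimal sketch (job names and the dependency graph are invented, and there is no cycle detection):

```python
# Resolve job-on-job dependencies with a depth-first topological sort.
# `deps` maps each job name to the jobs it depends on.
def run_order(deps):
    order, seen = [], set()

    def visit(job):
        if job in seen:
            return
        seen.add(job)
        for prerequisite in deps.get(job, ()):
            visit(prerequisite)  # enqueue prerequisites first
        order.append(job)

    for job in deps:
        visit(job)
    return order

deps = {
    "email_receipt": ["generate_pdf"],  # the email attaches the PDF
    "generate_pdf": [],
    "resize_image": [],
}
order = run_order(deps)
```

This is precisely the lower-level plumbing (plus retries, failure handling, and so on) that a ready-made work-queue framework would ideally take off my hands.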

If there's a work queue solution for Mesos, I would love to know too.



On 7/10/2015 8:08 PM, Brian Candler wrote:
On 07/10/2015 09:44, Nikolaos Ballas neXus wrote:
Maybe you need to read a bit  :)
I have read plenty, including those you list, and I didn't find anything which 
met my requirements. Again I apologise if I was not clear in my question.

Spark has a very specific data model (RDDs) and applications which write to its 
API. I want to run arbitrary compute jobs - think "shell scripts" or "docker 
containers" which run pre-existing applications which I can't change.  And I 
want to fill a queue or pipeline with those jobs.

Hadoop also is for specific workloads, written to run under Hadoop and 
preferably using HDFS.

The nearest Hadoop gets to general-purpose computing, as far as I can see, is 
its YARN scheduler. YARN can in turn run under Mesos. Therefore a job queue 
which can run on YARN might be acceptable, although I'd rather not have an 
additional layer in the stack. (There was an old project for running Torque 
under YARN, but it has been abandoned.)

Regards,

Brian.



Nikolaos Ballas | Software Development Manager

Technology Nexus S.a.r.l.
2-4 Rue Eugene Rupert
2453 Luxembourg
Delivery address: 2-3 Rue Eugene Rupert, Vertigo Polaris Building
Tel: +3522619113580
cont...@nexusgroup.com | www.nexusgroup.com
LinkedIn: www.linkedin.com/company/nexus-technology | 
Twitter: www.twitter.com/technologynexus | 
Facebook: www.facebook.com/pages/Technology-Nexus/133756470003189



