On 07/10/2015 09:44, Nikolaos Ballas neXus wrote:
> Maybe you need to read a bit :)
I have read plenty, including those you list, and I didn't find anything that met my requirements. Again, I apologise if I was not clear in my question.

Spark has a very specific data model (RDDs) and requires applications written to its API. I want to run arbitrary compute jobs - think "shell scripts" or "Docker containers" running pre-existing applications that I can't change - and I want to fill a queue or pipeline with those jobs.
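
To make that concrete, here's a rough Python sketch of the sort of thing I'm after. It's a single-machine toy, not a Mesos framework, and the command strings and image name are made up - the point is just that a "job" is an opaque shell command, not something written against a framework API:

    import queue
    import subprocess
    import threading

    # Toy queue of "jobs": each job is just a shell command wrapping a
    # pre-existing application. These commands are made up for illustration.
    jobs = queue.Queue()
    jobs.put("./preprocess.sh input1.dat")
    jobs.put("docker run --rm some-image ./analyse")

    def worker():
        # Drain the queue, running each job as-is: no changes to the apps.
        while True:
            try:
                cmd = jobs.get_nowait()
            except queue.Empty:
                return
            subprocess.call(cmd, shell=True)
            jobs.task_done()

    # Two workers draining the queue, standing in for executors on nodes.
    threads = [threading.Thread(target=worker) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

What I want is effectively this, but with the workers spread across a Mesos cluster.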

Hadoop, likewise, is for specific workloads: jobs written to run under Hadoop, preferably using HDFS.

The nearest Hadoop gets to general-purpose computing, as far as I can see, is its YARN scheduler, and YARN can in turn run under Mesos. So a job queue that can run on YARN might be acceptable, although I'd rather not add another layer to the stack. (There was an old project for running Torque under YARN, but it has been abandoned.)

Regards,

Brian.
