Come and try our Jenkins SGE Plugin,
https://github.com/jenkinsci/sge-cloud-plugin.

It has been performing well for us in our enterprise application.

John McGehee
Wave Computing

On Sat, Apr 23, 2016 at 8:12 AM, Dr. Mark Asbach <mark.asb...@pixolus.de>
wrote:

> Hi S(o)GE users,
>
> I need some advice :-)
>
> During my Ph.D. times, I discovered Sun Grid Engine and used it to run
> distributed machine learning jobs on a (then) medium-sized cluster (96
> CPUs). I liked it. Now, a couple of years later, I am again looking for a
> scheduling and resource allocation system like SGE for a similar purpose.
> Unfortunately, SGE seems to be pretty dead. In addition, I have similar but
> not identical needs stemming from continuous integration and from running
> (micro-)web services. Ideally, I would like a simple, integrated solution
> and not a complex monster built from many large parts.
>
> Here's what I'm trying to accomplish:
>
> - Run custom jobs for machine learning / data analysis. When I have an
> idea, I write a job and run it. Usually, the same job is only run a few
> times. Jobs will span multiple hosts and might require OpenMP + MPI. This
> is where SGE was really good in the past. The crowd seems to have shifted
> to running everything on Hadoop, although that setup would be really
> inefficient for my purposes. I usually just need a couple of CPUs (< 100).
>
> - Run frequent identical jobs for continuous integration. We have a Jenkins
> instance running, but it is lacking in some regards. Resource allocation and
> scheduling are more or less non-existent. For example, I cannot define
> resources for things like attached mobile devices that can be used by only
> one job at a time on a multi-core Mac. These are things already
> solved with SGE, but SGE itself does not cover the main aspects of CI, i.e.
> the collection and analysis of the build data.
>
> - Run (micro-)services. We have a couple of services that need to run
> continuously. Some need to be scaled up and down regarding the number of
> parallel instances. This is where people are now using Docker and (also
> quite complex) resource allocation and scheduling systems like Kubernetes.
>
> All three sorts of tasks compete for the same resources and suffer the
> same problem of provisioning/configuring the workers to fulfill a job's
> requirements. We're using Vagrant + Ansible to provision VMs for our
> machine learning tasks and I would like to extend this to the other
> problems as well. The resource allocation is still somewhat manual in our
> case. I would really like to cut down the complexity of our setup.
>
> It would be great if you can point me to any helpful information, ideas,
> or projects that could help me solve this.
>
> Best,
> Mark
> _______________________________________________
> users mailing list
> users@gridengine.org
> https://gridengine.org/mailman/listinfo/users
>
>