Glad to see someone else is playing around with Mesos.
I have a Mesos branch that is getting a little long in the tooth. I'd like
to get a straight job runner (non-LWR, with a shared file system) running
under Mesos for Galaxy before I submit that work for a pull request.
The hackathon is only 12
Hey Kyle, all,
If anyone wants to play with running Galaxy jobs within an Apache
Mesos environment I have added a prototype of this feature to the LWR.
https://bitbucket.org/jmchilton/lwr/commits/555438d2fe266899338474b25c540fef42bcece7
Hi Kyle,
Swift indeed is a complete framework for distributed computing.
Distributing files out to cluster nodes, starting processes, and bringing
result files back to the submit host are all done out of the box (the
stagein-exec-stageout cycle).
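
To make the stagein-exec-stageout idea concrete, here is a rough Python
sketch of one cycle. It is not how Swift implements it: a local temporary
directory stands in for the cluster node, and run_staged and its parameters
are hypothetical names made up for illustration.

```python
import shutil
import subprocess
import tempfile
from pathlib import Path


def run_staged(command, inputs, outputs, dest_dir):
    """Run one stage-in / exec / stage-out cycle (illustrative only).

    A temporary directory plays the role of the remote node's scratch space;
    a real system would copy files over the network instead.
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    with tempfile.TemporaryDirectory() as scratch:
        scratch = Path(scratch)
        # Stage in: copy each input file into the scratch directory.
        for src in inputs:
            shutil.copy(src, scratch / Path(src).name)
        # Exec: run the command with the scratch directory as its cwd.
        subprocess.run(command, cwd=scratch, check=True)
        # Stage out: copy the named result files back to the submit side.
        staged = []
        for name in outputs:
            staged.append(Path(shutil.copy(scratch / name, dest_dir / name)))
    return staged
```

Calling run_staged(["sort", "-o", "sorted.txt", "reads.txt"], ["reads.txt"],
["sorted.txt"], "results") would, assuming a sort binary is on the PATH,
leave results/sorted.txt behind and throw away the scratch directory.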
We can discuss offline if you are interested in giving it a shot.
I don't think implementation will be very difficult. The bigger question is:
is this a technology people are open to?
The nearest competitor is YARN (
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YARN.html).
Mesos seems a bit more geared toward general purpose usage (with several
You probably are a good person to get an opinion from. My plan isn't to
write new frameworks, but rather use existing libraries that can
communicate with Mesos to set up their parallel environments.
But for Swift, you would probably want to write a new framework. Just
looking at Swift, I imagine
I think one of the aspects where Galaxy is a bit soft is the ability to do
distributed tasks. The current system of split/replicate/merge tasks based
on file type is a bit limited and hard for tool developers to expand upon.
Distributed computing is a non-trivial thing to implement and I think it