Re: [galaxy-dev] Request: Option to reduce server data transfer for big workflow in cluster

Ben Gift Tue, 17 Dec 2013 09:53:46 -0800

Hi John, thanks for the reply.

Yes, I mean Galaxy's default behavior of keeping all the data on all nodes
of our condor cluster. So for instance if I run a job, then the output of
that job is copied to every node in the cluster. Is this not the normal
behavior?



On Tue, Dec 17, 2013 at 9:42 AM, John Chilton <[email protected]> wrote:

> Hey Ben,
>
> Thanks for the e-mail. I did not promise anything was coming soon, I
> only said people were working on parts of it. It is not a feature yet
> unfortunately - multiple people including myself are thinking about
> various parts of this problem though.
>
> I would like to respond, but I am trying to understand this line: "We
> can't do this because Galaxy copies all intermediate steps to all
> no(d)es, which would bog down the servers too much."
>
> Can you describe how you are doing this staging for me? Is data
> currently being copied around to all the nodes, if so how are you
> doing that? Or are you trying to say that Galaxy requires the data to
> be available on all of the nodes?
>
> -John
>
> On Tue, Dec 17, 2013 at 11:15 AM, Ben Gift <[email protected]> wrote:
> > We've run into a scenario lately where we need to run a very large
> > workflow (huge data in intermediate steps) many times. We can't do this
> > because Galaxy copies all intermediate steps to all nodes, which would
> > bog down the servers too much.
> >
> > I asked about something similar before, and John mentioned that the
> > feature to automatically delete intermediate step data in a workflow
> > once it completed was coming soon. Is that a feature now? That would help.
> >
> > Ultimately, though, we can't be copying all this data around to all
> > nodes. The network just isn't good enough, so I have an idea.
> >
> > What if we have an option on the 'run workflow' screen to only run on one
> > node (eliminating the neat Galaxy concurrency ability for that workflow
> > unfortunately)? Then it just propagates the final step data.
> >
> > Or maybe only copy to a couple other nodes, to keep concurrency.
> >
> > If the job errored then in this case I think it should just throw out all
> > the data, or propagate where it stopped.
> >
> > I've been trying to work on implementing this myself but it's taking me
> > a long time. I only just started understanding the pyramid stack, and am
> > putting in the checkbox in the run.mako template. I still need to learn
> > the database schema, message passing, how jobs are stored, and how to
> > tell condor to only use 1 node (and more, I'm sure) in Galaxy. (I'm
> > drowning)
> >
> > This seems like a really important feature though as Galaxy gains more
> > traction as a research tool for bigger projects that demand working with
> > huge data, and running huge workflows many many times.
> >
> > ___________________________________________________________
> > Please keep all replies on the list by using "reply all"
> > in your mail client.  To manage your subscriptions to this
> > and other Galaxy lists, please use the interface at:
> >   http://lists.bx.psu.edu/
> >
> > To search Galaxy mailing lists use the unified search at:
> >   http://galaxyproject.org/search/mailinglists/
>
