Re: [galaxy-dev] workflow representation and execution...

2011-03-30 Thread Dannon Baker
Hi Kostas,

Workflows are saved in the database.  If you're looking for an external 
representation you can generate one by going to 'Download or Export' in an 
individual workflow's menu in the main workflows list.  The download option 
there gives a JSON representation of the workflow that can also be imported 
into other Galaxy instances.
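
As a rough illustration of what can be done with the downloaded file, the sketch below lists each step and the steps it depends on. The key names used here ("steps", "tool_id", "input_connections", "id") are assumptions about the export layout and may differ between Galaxy versions, and the embedded JSON is a hand-written stand-in with a hypothetical tool id, not real Galaxy output.

```python
import json


def summarize_workflow(workflow):
    """Return (step id, tool id, upstream step ids) per step, ordered by step id.

    Assumes "steps" maps string indices to step dicts, and that each entry in
    "input_connections" is a dict carrying the upstream step under "id".
    """
    summary = []
    steps = workflow.get("steps", {})
    for step_id in sorted(steps, key=int):
        step = steps[step_id]
        connections = step.get("input_connections", {})
        upstream = sorted({int(conn["id"]) for conn in connections.values()})
        summary.append((int(step_id), step.get("tool_id"), upstream))
    return summary


# Hand-written stand-in for an exported workflow (assumed layout).
example = json.loads("""
{
  "name": "example",
  "steps": {
    "0": {"type": "data_input", "tool_id": null, "input_connections": {}},
    "1": {"type": "tool", "tool_id": "example_tool",
          "input_connections": {"input": {"id": 0, "output_name": "output"}}}
  }
}
""")

for step_id, tool_id, upstream in summarize_workflow(example):
    print(step_id, tool_id, upstream)
```

For a real export, replace the stand-in with json.load() on the downloaded file and check the key names against what your Galaxy actually writes.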

As far as where the jobs will run, it is definitely possible that multiple jobs 
in a single workflow will execute on different nodes.

-Dannon


On Mar 30, 2011, at 5:21 AM, Kostas Karasavvas wrote:

 Hi all!
 
 I have two questions:
 
 1) Where are workflows saved in the filesystem? Are they saved as xml
 files (or something) that describes the structure of the
 workflow/pipeline? (e.g. tools to run, parameters, etc.)
 
 2) When a cluster is used, does a complete workflow run on one node or
 is it possible that multiple tools in the same workflow are executed
 in different nodes?
 
 Thank you in advance!
 Kostas
 ___
 Please keep all replies on the list by using reply all
 in your mail client.  To manage your subscriptions to this
 and other Galaxy lists, please use the interface at:
 
  http://lists.bx.psu.edu/



Re: [galaxy-dev] workflow representation and execution...

2011-03-30 Thread Kostas Karasavvas
Hi Glen,

 Good to know. I assume that currently that is not happening though,
 right? If yes, is it in your immediate plans?


 It should happen now - each step in the workflow would get submitted as a 
 separate job to the cluster.  At that point it is up to the job scheduler and 
 not Galaxy to determine where the job is run.

Ah, great. So if two connected tasks run on different nodes, how is
data movement between these nodes handled?  Is a distributed file
system used to take care of that, or is it done manually?

Thank you!
Kostas



Re: [galaxy-dev] workflow representation and execution...

2011-03-30 Thread Glen Beane

On Mar 30, 2011, at 8:34 AM, Kostas Karasavvas wrote:

 Hi Glen,
 
 Good to know. I assume that currently that is not happening though,
 right? If yes, is it in your immediate plans?
 
 
 It should happen now - each step in the workflow would get submitted as a 
 separate job to the cluster.  At that point it is up to the job scheduler 
 and not Galaxy to determine where the job is run.
 
 Ah, great. So if two connected tasks run on different nodes, how is
 data movement between these nodes handled?  Is a distributed file
 system used to take care of that, or is it done manually?
 
 Thank you!
 Kostas



The job should copy its output files into the Galaxy database/files directory 
before it terminates.  So, assuming you are not using data staging and the 
database directory is shared via NFS, the next tool will read the files 
from this location.  There is an option to stage files if you use TORQUE, but I 
think most sites use the network shared storage approach.
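
To make the shared-storage handoff concrete, here is a minimal sketch of the idea, not Galaxy's actual code. Local temporary directories stand in for two cluster nodes and for the NFS-mounted database/files directory, and the run_step helper and the file names are hypothetical.

```python
import os
import shutil
import tempfile


def run_step(work_dir, shared_dir, name, transform, input_path=None):
    """Run one step in its own scratch directory, then publish its output.

    The step reads its input (if any) from the shared directory, writes the
    result locally, and copies it to the shared directory before returning.
    """
    os.makedirs(work_dir, exist_ok=True)
    data = ""
    if input_path is not None:
        with open(input_path) as handle:
            data = handle.read()
    local_out = os.path.join(work_dir, name + ".dat")
    with open(local_out, "w") as handle:
        handle.write(transform(data))
    shared_out = os.path.join(shared_dir, name + ".dat")
    shutil.copy(local_out, shared_out)  # publish before the job terminates
    return shared_out


with tempfile.TemporaryDirectory() as root:
    # Stand-in for the shared database/files directory.
    shared = os.path.join(root, "database", "files")
    os.makedirs(shared)
    # Two "nodes" with separate scratch space that only share `shared`.
    first = run_step(os.path.join(root, "node_a"), shared, "step1",
                     lambda _: "acgt\n")
    second = run_step(os.path.join(root, "node_b"), shared, "step2",
                      str.upper, input_path=first)
    with open(second) as handle:
        result = handle.read()

print(repr(result))
```

The second step never touches the first node's scratch directory; it only sees what was copied into the shared location, which is why the scheduler is free to place each job on any node that mounts it.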


--
Glen L. Beane
Senior Software Engineer
The Jackson Laboratory
(207) 288-6153







Re: [galaxy-dev] workflow representation and execution...

2011-03-30 Thread Kostas Karasavvas
Hi Glen,

All is clear now. Thanks for the info! :)

K.

