I am getting up to speed on Galaxy and couldn't find examples or discussion 
of its architecture, so I was hoping an expert could give some quick 
pointers/guidance.

Where can I find out whether the installed applications make use of multiple 
nodes (via MPI, etc.)? That would tell me whether starting up X nodes actually 
gives faster processing.

If a workflow has multiple initial inputs, say NGS exome data from tumor and 
blood (which get compared later in the workflow), will each step that has no 
dependency get sent to a different node, or will the entire workflow run on 
one node?

If I have NGS data for 20 patients sitting in an S3 bucket and want a specific 
workflow run against each patient's input(s), does this require manual 
selection of files by a user, or can the workflow be automated?

Can I start a workflow programmatically from a remote machine (via REST), 
given that I have automated the upload of NGS data to S3 and know the input 
file(s) for each workflow?
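
To make the two questions above concrete, here is a rough Python sketch of the 
automation I have in mind. It only groups S3 keys by patient and builds one 
launch request per patient; it does not call Galaxy. The bucket name, the 
"patient_id/filename" key layout, the workflow ID and the payload field names 
are all placeholders I made up for illustration, not part of any Galaxy API.

```python
from collections import defaultdict


def group_inputs_by_patient(keys):
    """Group S3 object keys by patient ID.

    Assumes (hypothetically) that keys look like 'patient_id/filename'.
    Keys without a patient prefix are skipped.
    """
    groups = defaultdict(list)
    for key in keys:
        patient_id, _, filename = key.partition("/")
        if filename:
            groups[patient_id].append(key)
    return dict(groups)


def build_workflow_requests(bucket, keys, workflow_id):
    """Build one launch payload per patient.

    The payload fields are placeholders, not a real Galaxy request format.
    """
    requests = []
    for patient_id, patient_keys in sorted(group_inputs_by_patient(keys).items()):
        requests.append({
            "workflow_id": workflow_id,
            "patient": patient_id,
            "inputs": [f"s3://{bucket}/{k}" for k in sorted(patient_keys)],
        })
    return requests


if __name__ == "__main__":
    example_keys = ["p01/tumor.fastq", "p01/blood.fastq", "p02/tumor.fastq"]
    for request in build_workflow_requests("ngs-bucket", example_keys, "exome-v1"):
        print(request)
```

What I am hoping for is a remote call that accepts something like each of 
these per-patient requests, so that no one has to select the files by hand.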

Is it possible to present credentials in a workflow when downloading a file 
from S3, for cases where authentication is required before the file can be 
downloaded? I am working with NGS data from patients, so I am trying to 
understand how to keep security tight. I currently plan to restrict downloads 
to the cluster's IP addresses, but that gets a little complicated because of 
what Amazon does behind the scenes in its internal network.

I would also like to push results/output back to S3 and didn't see an obvious 
way to do this. It gets a little complicated in that the results would 
probably need to go back into the same S3 bucket, in a new folder, that the 
original source files came from. I saw mention of using scp to move files, but 
that doesn't help with putting results back in S3.


So far I really like what I have seen and hope Galaxy becomes the future 
toolbox for our work.

Does a roadmap exist for what is planned in the future? For example, are any 
additional NGS tools like Abyss going to make it into the build? I am 
interested in NGS software that handles the dynamics of cancer, such as gene 
fusion events, CNVs, etc.

Thanks

Scooter









