[galaxy-dev] Some questions about cloudman

2011-10-28 Thread Cittaro Davide
Hi there, I'm in the middle of a decision: should I go into the cloud or not? I'm reading the docs on the Galaxy wiki, and I see that besides EC2 and EBS I need S3 storage. What is that for (meaning: why does Galaxy need S3)? @people already using it: how do you send NGS data? I need some feedback! :-)

Re: [galaxy-dev] Some questions about cloudman

2011-10-28 Thread Cittaro Davide
On Oct 28, 2011, at 5:35 PM, James Taylor wrote: It currently uses a tiny amount of S3 storage just to save configuration information about your instance. OK, I have never used AWS, so I didn't know S3 holds that information. I guess I will have to read a how-to. Long term though we plan

[galaxy-dev] Best practices with data on clusters

2011-12-20 Thread Cittaro Davide
Hi developers, I have a question that may be OT, but since Galaxy can work in a clustered environment with a queueing system, I'll try to ask here. Is there anybody here who copies data to a local temporary directory before performing any analysis step and then copies the final results back?
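
The staging pattern asked about above can be sketched roughly as follows. This is only an illustration, not something Galaxy provides: the function name `run_with_local_scratch`, the `analyze` callback, and the `scratch_root` parameter are all hypothetical, and it assumes the node-local disk is large enough for both input and output.

```python
import shutil
import tempfile
from pathlib import Path


def run_with_local_scratch(input_path, output_path, analyze, scratch_root=None):
    """Stage input to node-local scratch, run `analyze` there, stage the result back.

    `analyze` is a hypothetical callable taking (local_input, local_output).
    `scratch_root` is an assumed node-local directory; None uses the system default.
    """
    input_path = Path(input_path)
    output_path = Path(output_path)
    # The temporary directory is removed automatically, even if analyze() raises.
    with tempfile.TemporaryDirectory(dir=scratch_root) as scratch:
        local_in = Path(scratch) / input_path.name
        # Prefix the output name so it cannot collide with the input name.
        local_out = Path(scratch) / ("out_" + output_path.name)
        shutil.copy2(input_path, local_in)       # stage in from shared storage
        analyze(local_in, local_out)             # all I/O happens on local disk
        output_path.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(local_out, output_path)     # stage out to shared storage
    return output_path
```

Whether this pays off depends on how many times the step reads its input: a single sequential pass may not justify the extra copy.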

Re: [galaxy-dev] Best practices with data on clusters

2012-01-04 Thread Cittaro Davide
Hi Nate, On Jan 3, 2012, at 10:15 PM, Nate Coraor wrote: That said, if you have a lot of interim steps that produce large data that then get merged via some process back to final outputs, it absolutely makes sense to use local disk for those steps (assuming local disk is large enough - another

Re: [galaxy-dev] Interested in speaking with other institutions deploying Galaxy locally?

2012-04-27 Thread Cittaro Davide
Here's another interested person! On Apr 27, 2012, at 8:45 PM, Ann Black-Ziegelbein wrote: Hi everyone - Here at the University of Iowa we are working on deploying Galaxy locally for campus-wide access. I am interested in forming a community of other institutions trying to deploy Galaxy locally and

[galaxy-dev] mapreduce approach in workflows

2012-06-11 Thread Cittaro Davide
Hi all, We are trying to write our own GATK workflow (although I guess an official one will be released...) and I realized I have to parallelize execution as some steps are painfully slow. In order to run GATK quickly I need to split the whole process on specific genomic intervals (-L option)
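
As a rough illustration of the scatter step described above, the sketch below cuts each contig into fixed-size chunks and formats them as `contig:start-end` strings (1-based, inclusive), the form accepted by the `-L` option. The function name `split_intervals`, the contig lengths, and the chunk size are assumptions made for the example; launching one job per interval and merging the per-interval outputs are not shown.

```python
def split_intervals(contig_lengths, chunk_size):
    """Return 1-based, inclusive 'contig:start-end' strings covering every contig.

    `contig_lengths` maps contig name to its length in bases (hypothetical input).
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be a positive integer")
    intervals = []
    for contig, length in contig_lengths.items():
        for start in range(1, length + 1, chunk_size):
            # Clip the last chunk of each contig to the contig end.
            end = min(start + chunk_size - 1, length)
            intervals.append(f"{contig}:{start}-{end}")
    return intervals


# Toy example with made-up contig lengths; each string would go to one job.
for interval in split_intervals({"chr1": 2500, "chrM": 100}, 1000):
    print(interval)
```

Fixed-size chunks are the simplest choice; they ignore features such as gene or target boundaries, which may matter for some tools.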

[galaxy-dev] interactive debug of a tool

2012-06-11 Thread Cittaro Davide
Hi all, I remember there used to be a way to run a debug shell on tools that fail to run in Galaxy. Unfortunately I have a bad memory and Google is not helping: how can I run it? d /* Davide Cittaro, PhD Coordinator of Bioinformatics Core Center for Translational Genomics and Bioinformatics San