Re: [galaxy-dev] Automatically removing items from history
I haven't had a chance to do anything on this yet, but I'll see if I can work something out in the near future.

-Dannon

On Sep 7, 2011, at 9:34 PM, Glen Beane wrote:
> On Sep 7, 2011, at 8:10 PM, Edward Kirton wrote:
>> i'm resurrecting this thread to see if there's any more support for the idea of deleting intermediate files in a workflow. i think this is an important feature to have. oftentimes a workflow creates many intermediate files no one will ever look at, and leaving it up to the user to clean up their data files is asking too much. there's another ticket regarding allowing users to still preview the metadata of deleted workflow history items, and the two would go together nicely.
>
> I am _very_ interested in this feature
>
> --
> Glen L. Beane
> Senior Software Engineer
> The Jackson Laboratory
> (207) 288-6153

___
Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: http://lists.bx.psu.edu/
[galaxy-dev] transfer files from remote system to galaxy
Hi,

In my local instance of Galaxy, I want to add an option to get files from a remote system into a Galaxy data library. Other than by URL, is there any way to get remote files into Galaxy?

Regards
Re: [galaxy-dev] Tophat non Sanger input
Dear Stephen (and others):

The sole reason for requiring fastq-sanger input to all of our wrappers was to force users to run their data through the groomer. It is slow, but it checks data consistency in a way that is more robust than just checking 'four lines per fastq block' and prevents a lot of problems downstream. Here on Galaxy @ Penn State we see a lot of fastq files edited in MS Word and other similar horrors, which are caught by the groomer, preventing users from running into problems later on (and so cutting down on the support overhead: investigating why the groomer has failed is a lot easier than researching why a particular set of polymorphisms derived from a Word-edited fastq file clusters Ukrainians with parasitic worms). In addition, even though Illumina did switch to Sanger encoding, there is still a lot of old data out there.

However, we are open to suggestions ... What we are thinking of lately is switching to unaligned BAM for everything. One of the benefits here is the ability to add readgroups from day 1, simplifying multisample analyses down the road.

a.

Anton Nekrutenko
http://galaxyproject.org

On Sep 8, 2011, at 10:14 AM, Stephen Taylor wrote:
> On 08/09/2011 14:17, Hans-Rudolf Hotz wrote:
>> On 09/08/2011 09:47 AM, Stephen Taylor wrote:
>>> On 07/09/2011 20:22, Edward Kirton wrote:
>>>> seems unnecessary since illumina switched over to fastqsanger now.
>>>> http://www.illumina.com/truseq/quality_101/quality_scores.ilmn
>>>
>>> Eventually...unfortunately we still get a lot of fastqillumina :-(
>>
>> I might miss your point, but why can't you use the fastq groomer tool?
>
> - Duplication of data (disk space usage)
> - Groomer is slow and puts more demands on CPU usage, where it can be done easily on the fly by tophat
> - Consistency (bowtie does it)
>
> From the responses (or lack of :-)) we've been spurred on to change the wrapper. If there is interest we will commit it to the code base when done.
>
> Cheers,
> Steve
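
For reference, the on-the-fly conversion discussed in this thread comes down to re-offsetting the quality characters, plus the kind of per-record consistency check Anton describes. The following is only a minimal sketch, assuming Illumina 1.3+ input (Phred scores stored at ASCII offset 64) and Sanger output (offset 33). The function names are hypothetical; this is not the groomer's code or the tophat wrapper's code, and it does not handle the older Solexa score scale.

```python
ILLUMINA_OFFSET = 64  # assumed input encoding: Illumina 1.3+ (Phred+64)
SANGER_OFFSET = 33    # output encoding: Sanger (Phred+33)
MAX_PHRED = 62        # highest score representable at offset 64 in printable ASCII


def illumina_to_sanger(quality):
    """Re-offset a Phred+64 quality string to Phred+33."""
    converted = []
    for ch in quality:
        score = ord(ch) - ILLUMINA_OFFSET
        if score < 0 or score > MAX_PHRED:
            # Typical symptom of a file that is already Sanger-encoded
            # or has been damaged by an editor.
            raise ValueError("quality character %r is outside the Phred+64 range" % ch)
        converted.append(chr(score + SANGER_OFFSET))
    return "".join(converted)


def convert_record(header, seq, plus, qual):
    """Check one four-line FASTQ record and return it with Sanger qualities."""
    if not header.startswith("@"):
        raise ValueError("header line must start with '@'")
    if not plus.startswith("+"):
        raise ValueError("third line must start with '+'")
    if len(seq) != len(qual):
        raise ValueError("sequence and quality lengths differ")
    return header, seq, plus, illumina_to_sanger(qual)
```

A wrapper could stream records through `convert_record` instead of writing a second groomed copy to disk, which is the disk-space argument made above; whether that is worth losing the groomer's fuller checks is the trade-off under discussion.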
[galaxy-dev] Server down?
Hi,

I am a user of Galaxy Test. Has the Galaxy Test server been down for the last 2 days? My Tophat and Cufflinks jobs are taking much longer to run than they did before.

Thanks.

Best regards,
Crystal
Re: [galaxy-dev] cuffcompare wrapper
It's not an issue in the tmp dir but in the job_working_directory. I run most of those other tools with no problems. I don't think we should make it a requirement across the board, and I think we can come up with alternative ways to clean up the job_working_directory. I am hoping that you could add the unlink to the cuffcompare wrapper, as it is the only one where the symlink causes me a problem as far as I have tested. We don't want our code base to differ too much from galaxy-central.

Thanks,
Ilya

From: Jeremy Goecks [mailto:jeremy.goe...@emory.edu]
Sent: Wednesday, September 07, 2011 6:26 PM
To: Chorny, Ilya
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] cuffcompare wrapper

Ilya,

A search of the Galaxy codebase indicates that thirteen tools use symlinks (e.g. GATK, Sicer, Picard, Cuff*, Bowtie), so the changes required to support this new code are significant. (Changes would also likely be needed for tools in the tool shed.) Also, asking tool wrappers to delete symlinks would be an idiosyncratic requirement, as tools assume they have a temporary working directory at their disposal. For these reasons, it seems best to have the tool framework clean up symlinks as necessary to support the new code.

Best,
J.

On Sep 7, 2011, at 2:28 PM, Chorny, Ilya wrote:

Ok, I figured out why you need the symlink. Can you add an unlink after the process completes? i.e.

    for i, arg in enumerate( args ):
        input_file_name = "./input%i" % ( i+1 )
        os.unlink( input_file_name )

From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Chorny, Ilya
Sent: Wednesday, September 07, 2011 9:18 AM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] cuffcompare wrapper

Hi Jeremy,

The symlink in the cuffcompare wrapper was causing Galaxy to crash, because I run as the actual user and have to chmod the job_working directory at the end so Galaxy can clean it up. It turns out the symlink does not seem to be needed. Am I missing something? See below.

Your code:

    for i, arg in enumerate( args ):
        input_file_name = "./input%i" % ( i+1 )
        os.symlink( arg, input_file_name )
        cmd += "%s " % input_file_name

My code:

    for i, arg in enumerate( args ):
        cmd += "%s " % arg

Ilya Chorny Ph.D.
Bioinformatics Scientist I
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Work: 858.202.4582
Email: icho...@illumina.com
Website: www.illumina.com
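
The link-then-unlink pattern requested in this thread can be sketched as below. This is an illustration only, assuming a POSIX filesystem and a wrapper that launches the tool itself; `run_with_symlinks` and its parameters are hypothetical names, not code from the cuffcompare wrapper or from the Galaxy framework.

```python
import os
import subprocess


def run_with_symlinks(base_cmd, args, workdir):
    """Link each input into workdir as input1, input2, ..., run the tool,
    and remove the links again even if the tool fails."""
    links = []
    try:
        for i, arg in enumerate(args):
            link = os.path.join(workdir, "input%i" % (i + 1))
            os.symlink(os.path.abspath(arg), link)
            links.append(link)
        # Pass the short link names so outputs derived from input names stay predictable.
        cmd = list(base_cmd) + [os.path.basename(link) for link in links]
        return subprocess.call(cmd, cwd=workdir)
    finally:
        # Only remove what this function created, and only if it is still a link.
        for link in links:
            if os.path.islink(link):
                os.unlink(link)
```

Doing the cleanup in a `finally` block means the working directory is left without links whether the tool succeeds or not. Doing the same once in the framework, as Jeremy suggests, would avoid repeating it in every wrapper.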
Re: [galaxy-dev] macs in galaxy
Is there a python script associated with the macs.xml file?

From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Kanwei Li
Sent: Wednesday, August 24, 2011 5:41 PM
To: KOH Jia Yu Jayce
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] macs in galaxy

You can use the MACS wrappers here (for 1.4), until we officially add it to our distribution:

https://bitbucket.org/cistrome/cistrome-harvard/src/779d208c2cbd/tools/peakcalling/

Thanks,
K

On Wed, Aug 24, 2011 at 8:32 PM, KOH Jia Yu Jayce ko...@gis.a-star.edu.sg wrote:

Yes. Thank you for your help ☺

-----Original Message-----
From: Kanwei Li [mailto:kan...@gmail.com]
Sent: Wednesday, August 24, 2011 11:28 PM
To: KOH Jia Yu Jayce
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] macs in galaxy

Hi Jayce,

Are you running this on your local instance? It seems you are running MACS 1.4, which our wrapper does not support yet, but we are planning to add a wrapper for 1.4 soon.

Thanks,
K

On Wed, Aug 24, 2011 at 4:20 AM, KOH Jia Yu Jayce ko...@gis.a-star.edu.sg wrote:

When running macs in galaxy, the following error was found:

ERROR:root:mfold format error! Your input is '32'. It should be like '10,30'

A format for mfold like 10,30 is expected… but the default value configured in the xml remains 32. Will there be an updated version of this xml in future?

Also, after altering the default mfold value to 10,30, the integer type in the param tag for mfold becomes erroneous. May I ask what the correct type is for the input format 10,30?

Thanks a lot
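
On the mfold question: a value like 10,30 is a pair of integers, not a single integer, so one option may be to carry it as a plain text value and check it before MACS is called. A minimal sketch under that assumption follows; `parse_mfold` is a hypothetical helper, not part of the MACS wrapper, and it only checks the "two comma-separated integers" shape that the error message describes.

```python
def parse_mfold(value):
    """Parse an mfold string such as '10,30' into a (lower, upper) pair of ints."""
    parts = value.split(",")
    if len(parts) != 2:
        raise ValueError("mfold should look like '10,30', got %r" % value)
    try:
        lower, upper = (int(part.strip()) for part in parts)
    except ValueError:
        raise ValueError("mfold bounds must be integers, got %r" % value) from None
    return lower, upper
```

With this, the old default '32' is rejected in the wrapper with a readable message, instead of surfacing as a MACS error after the job has started.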
Re: [galaxy-dev] disk space and file formats
The use of (unaligned) BAM for readgroups seems like a good idea. At the very least it prevents inconsistently hacking this information into the FASTQ descriptor (a common problem with any simple format).

chris

On Sep 8, 2011, at 1:35 PM, Edward Kirton wrote:
> copied from another thread:
>
> On Thu, Sep 8, 2011 at 7:30 AM, Anton Nekrutenko an...@bx.psu.edu wrote:
>> What we are thinking of lately is switching to unaligned BAM for everything. One of the benefits here is the ability to add readgroups from day 1, simplifying multisample analyses down the road.
>
> this seems to be the simplest solution; i like it a lot. really, only the reads need to be compressed; most other outfiles are tiny by comparison, so a more general solution may be overkill. and if compression of everything is desired, zfs works well -- another of our sites (LANL) uses this and recommended it to me too. i just haven't been able to convince my own IT people to go this route, for technical reasons beyond my attention span.
>
> On Tue, Sep 6, 2011 at 9:05 AM, Peter Cock p.j.a.c...@googlemail.com wrote:
>> On Tue, Sep 6, 2011 at 5:00 PM, Nate Coraor n...@bx.psu.edu wrote:
>>> Peter Cock wrote:
>>>> On Tue, Sep 6, 2011 at 3:24 PM, Nate Coraor n...@bx.psu.edu wrote:
>>>>> Ideally, there'd just be a column on the dataset table indicating whether the dataset is compressed or not, and then tools get a new way to indicate whether they can directly read compressed inputs, or whether the input needs to be decompressed first.
>>>>>
>>>>> --nate
>>>>
>>>> Yes, that's what I was envisioning Nate. Are there any schemes other than gzip which would make sense? Perhaps rather than a boolean column (compressed or not), it should specify the kind of compression if any (e.g. gzip).
>>>
>>> Makes sense.
>>>
>>>> We need something which balances compression efficiency (size) with decompression speed, while also being widely supported in libraries for maximum tool uptake.
>>>
>>> Yes, and there's a side effect of allowing this: you may decrease efficiency if the tools used downstream all require decompression, and you waste a bunch of time decompressing the dataset multiple times.
>>
>> While decompression wastes CPU time and makes things slower, there is less data IO from disk (which may be network mounted), which makes things faster. So overall, depending on the setup and the task at hand, it could be faster.
>>
>> Is it time to file an issue on bitbucket to track this potential enhancement?
>>
>> Peter
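
The scheme Nate and Peter outline (a per-dataset compression value plus a per-tool "can read compressed input" flag) could look roughly like the sketch below. Everything here is assumed for illustration: the `compression` value stands in for the proposed dataset column, `tool_reads_compressed` for the proposed tool flag, and only gzip is handled. None of it is existing Galaxy code.

```python
import gzip
import shutil

# Hypothetical mapping from the proposed "compression" column value to an opener.
OPENERS = {None: open, "gzip": gzip.open}


def open_dataset(path, compression=None):
    """Open a dataset for binary reading according to its recorded compression."""
    try:
        opener = OPENERS[compression]
    except KeyError:
        raise ValueError("unsupported compression: %r" % compression)
    return opener(path, "rb")


def prepare_input(path, compression, tool_reads_compressed, scratch_path):
    """Return a path the tool can read, decompressing only when it has to."""
    if compression is None or tool_reads_compressed:
        return path
    # The tool cannot read the compressed file directly: write a scratch copy.
    with open_dataset(path, compression) as src, open(scratch_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return scratch_path
```

The cost Nate points out shows up in the second branch: every downstream tool that cannot read the compressed form pays for its own decompression pass, so the saving depends on how many tools set the flag.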
Re: [galaxy-dev] galaxy-dev Digest, Vol 63, Issue 8
I would also like to voice my support for this feature. I wrote a wrapper for bowtie that converts the SAM output to BAM after bowtie is finished, just to avoid the hassle of letting galaxy know that the SAM file existed (I didn't want to run Tophat).

After thinking about how I would go about deleting an existing output, it occurred to me that a deleting tool would require some extra logic, since you would probably want to prevent the output port on a workflow node/tool from being connected to the input of another node if the output is going to be deleted. I was wondering if it might make sense to modify the flagged output feature (the asterisk) of the galaxy tool nodes to delete the non-flagged outputs instead of just hiding them? Or perhaps just mark them as deleted so they will be taken care of by the cleanup scripts?

In this same line of thinking, it might make sense to have a flag for the input ports that specifies that the input will be consumed/deleted after the tool has successfully run. This would address the case where you want to use the output of a tool before it is removed.

Cheers,
Andrew

> I haven't had a chance to do anything on this yet, but I'll see if I can work something out in the near future.
>
> -Dannon
>
> On Sep 7, 2011, at 9:34 PM, Glen Beane wrote:
>> On Sep 7, 2011, at 8:10 PM, Edward Kirton wrote:
>>> i'm resurrecting this thread to see if there's any more support for the idea of deleting intermediate files in a workflow. i think this is an important feature to have. oftentimes a workflow creates many intermediate files no one will ever look at, and leaving it up to the user to clean up their data files is asking too much. there's another ticket regarding allowing users to still preview the metadata of deleted workflow history items, and the two would go together nicely.
>>
>> I am _very_ interested in this feature
>>
>> --
>> Glen L. Beane
>> Senior Software Engineer
>> The Jackson Laboratory
>> (207) 288-6153
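
Andrew's two ideas (mark non-flagged outputs as deleted, and only once nothing downstream still needs them) can be modelled in a few lines. This is a toy sketch, not Galaxy's data model: `StepOutput`, `flagged` and `consumers_pending` are hypothetical names standing in for the asterisk flag and for the count of connected tools that have not finished yet.

```python
from dataclasses import dataclass


@dataclass
class StepOutput:
    name: str
    flagged: bool = False       # the asterisk: the user wants to keep this output
    consumers_pending: int = 0  # downstream tools that have not finished with it
    deleted: bool = False


def mark_intermediates_deleted(outputs):
    """Mark every non-flagged output with no pending consumers as deleted.

    Returns the names that were newly marked, so that a cleanup script
    (assumed to run separately) can purge them later.
    """
    marked = []
    for out in outputs:
        if not out.flagged and out.consumers_pending == 0 and not out.deleted:
            out.deleted = True
            marked.append(out.name)
    return marked
```

Marking rather than removing keeps the actual disk cleanup with the existing cleanup scripts, and the pending-consumer count covers the case where an output must survive until the tool that consumes it has run successfully.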
Re: [galaxy-dev] cuffcompare wrapper
I'm confused. Why would the symlink cause problems for Cuffcompare but not for other tools that use symlinks (including Cufflinks and Cuffdiff)?

J.

On Sep 8, 2011, at 1:43 PM, Chorny, Ilya wrote:

It’s not an issue in the tmp dir but in the job_working_directory. I run most of those other tools with no problems. I don’t think we should make it a requirement across the board, and I think we can come up with alternative ways to clean up the job_working_directory. I am hoping that you could add the unlink to the cuffcompare wrapper, as it is the only one where the symlink causes me a problem as far as I have tested. We don’t want our code base to differ too much from galaxy-central.

Thanks,
Ilya

From: Jeremy Goecks [mailto:jeremy.goe...@emory.edu]
Sent: Wednesday, September 07, 2011 6:26 PM
To: Chorny, Ilya
Cc: galaxy-dev@lists.bx.psu.edu
Subject: Re: [galaxy-dev] cuffcompare wrapper

Ilya,

A search of the Galaxy codebase indicates that thirteen tools use symlinks (e.g. GATK, Sicer, Picard, Cuff*, Bowtie), so the changes required to support this new code are significant. (Changes would also likely be needed for tools in the tool shed.) Also, asking tool wrappers to delete symlinks would be an idiosyncratic requirement, as tools assume they have a temporary working directory at their disposal. For these reasons, it seems best to have the tool framework clean up symlinks as necessary to support the new code.

Best,
J.

On Sep 7, 2011, at 2:28 PM, Chorny, Ilya wrote:

Ok, I figured out why you need the symlink. Can you add an unlink after the process completes? i.e.

    for i, arg in enumerate( args ):
        input_file_name = "./input%i" % ( i+1 )
        os.unlink( input_file_name )

From: galaxy-dev-boun...@lists.bx.psu.edu [mailto:galaxy-dev-boun...@lists.bx.psu.edu] On Behalf Of Chorny, Ilya
Sent: Wednesday, September 07, 2011 9:18 AM
To: galaxy-dev@lists.bx.psu.edu
Subject: [galaxy-dev] cuffcompare wrapper

Hi Jeremy,

The symlink in the cuffcompare wrapper was causing Galaxy to crash, because I run as the actual user and have to chmod the job_working directory at the end so Galaxy can clean it up. It turns out the symlink does not seem to be needed. Am I missing something? See below.

Your code:

    for i, arg in enumerate( args ):
        input_file_name = "./input%i" % ( i+1 )
        os.symlink( arg, input_file_name )
        cmd += "%s " % input_file_name

My code:

    for i, arg in enumerate( args ):
        cmd += "%s " % arg

Ilya Chorny Ph.D.
Bioinformatics Scientist I
Illumina, Inc.
9885 Towne Centre Drive
San Diego, CA 92121
Work: 858.202.4582
Email: icho...@illumina.com
Website: www.illumina.com