[galaxy-dev] FastQC wrapper not seeing files at gzipped
Hi all - I've got a bunch of FASTQ files uploaded into a data library in Galaxy. The underlying files are gzipped; however, Galaxy strips the .gz from the filename and displays it as .fastq. When the Python wrapper rgFastQC.py gets called, it correctly sees the fastq.gz file. The wrapper creates a symbolic link to the .gz file in a tmp directory. The link is named .fastq. When FastQC tries to read this file, it fails because it's compressed. So one of two things is going wrong here: 1) It looks like the wrapper is incorrectly renaming the file, but it's using the name given to it in Galaxy. 2) When the file is uploaded into the data library, Galaxy is stripping off the .gz extension. I think #2 is the more likely cause. How can I keep Galaxy from stripping the .gz extension? ___ Please keep all replies on the list by using reply all in your mail client. To manage your subscriptions to this and other Galaxy lists, please use the interface at: https://lists.galaxyproject.org/ To search Galaxy mailing lists use the unified search at: http://galaxyproject.org/search/mailinglists/
Re: [galaxy-dev] No API way to delete a galaxy data library folder?
Hi Martin - Following up on this - I think your API folder delete commit is: https://bitbucket.org/galaxy/galaxy-central/commits/8f76a6abc5d7d5c98b6c148c4cfe75cc1c159e90 ? I was wondering how to find out more about this API call. Not knowing much about the guts of the Galaxy API code, is it a call like: http://[my galaxy]/api/folders/[my folder id]/delete ? I haven't tested it since I haven't played with the next-stable Galaxy branch. Roughly when does that get woven into stable or default? Regards, Damion
On Dec 04, 2014; 4:40pm, Martin Čech wrote: I have actually implemented this feature and it will be in the next release (which should be made public around next Monday).
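A rough sketch of what such a call might look like from Python. The thread does not confirm the route, so this assumes the commit exposes deletion as an HTTP DELETE on /api/folders/[folder id] (rather than a /delete suffix) and that the API key is passed as a key query parameter. The host, folder id and key are placeholders, and build_folder_delete_request is a hypothetical helper, not part of Galaxy.

```python
import urllib.parse
import urllib.request


def build_folder_delete_request(galaxy_url, folder_id, api_key):
    """Build (but do not send) a DELETE request for a library folder.

    Assumption: the folder is deleted via DELETE /api/folders/<id>;
    check the commit or the API docs of your Galaxy release first.
    """
    query = urllib.parse.urlencode({"key": api_key})
    url = "{}/api/folders/{}?{}".format(
        galaxy_url.rstrip("/"), urllib.parse.quote(folder_id), query
    )
    return urllib.request.Request(url, method="DELETE")


# Placeholder values, for illustration only.
req = build_folder_delete_request("http://localhost:8080", "F0123abcd", "my-api-key")
print(req.get_method(), req.full_url)
# Sending it would be: urllib.request.urlopen(req)
```

Building the request separately from sending it makes it easy to inspect the URL before touching a real library.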
Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped
To (I think) fix this, I changed line 50 in rgFastQC.py from
infname = self.opts.inputfilename
to
infname = self.opts.input
This will force FastQC to look at the real file and not the renamed dataset.
On Mon, Jan 12, 2015 at 12:20 PM, Ryan G ngsbioinformat...@gmail.com wrote: Yes, I'm doing a link to file on file system when doing a library import. Does this mean I should link to the uncompressed file?
On Mon, Jan 12, 2015 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com wrote: Ah. Then this is more subtle... are you using the library import option where Galaxy just symlinks to existing files? I thought that was not possible with gzipped files (for the reasons given below). Perhaps this is not being blocked, leading to the confused state you're seeing? Peter
On Mon, Jan 12, 2015 at 4:52 PM, Ryan G ngsbioinformat...@gmail.com wrote: Galaxy is not decompressing the file. The file is linked to on the filesystem.
On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock p.j.a.c...@googlemail.com wrote: Hi Ryan, The problem isn't Galaxy stripping the extension; rather, Galaxy is actually decompressing the file as part of the upload process. Unfortunately (and there is an open Trello enhancement request on this), Galaxy does not support storing any of the defined datatypes in compressed form UNLESS they are defined that way (like BAM files). This has led some Galaxy admins to define a new datatype lgzippedfastq (or similar - I'd have to check my old emails for the exact name used as a gzipped alternative to the Galaxy sangerfastq datatype) and then modify many/all of their tools to handle this. That is a lot of work, but does offer big disk savings for this key datatype. The Galaxy team instead use a compressed file system, so for usegalaxy.org ALL their data files are compressed but Galaxy can ignore this complexity. Peter
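For anyone who would rather keep the symlink step, here is a minimal sketch of an alternative to the one-line change in this thread: keep the display name but add .gz back when the real file is gzipped. It assumes FastQC chooses its reader from the file name, which is what the failure described here suggests. is_gzipped and link_for_fastqc are hypothetical helpers, not part of rgFastQC.py.

```python
import gzip
import os
import tempfile

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def link_for_fastqc(real_path, display_name, workdir):
    """Symlink real_path into workdir under a name matching its content.

    display_name is the name Galaxy shows (e.g. sample.fastq); if the
    real file is gzipped, .gz is appended so the extension is honest.
    """
    name = os.path.basename(display_name) or "input.fastq"
    if is_gzipped(real_path) and not name.endswith(".gz"):
        name += ".gz"
    link = os.path.join(workdir, name)
    os.symlink(os.path.abspath(real_path), link)
    return link


if __name__ == "__main__":
    # Tiny demonstration with a throwaway gzipped FASTQ record.
    workdir = tempfile.mkdtemp()
    real = os.path.join(workdir, "dataset_1.dat")
    with gzip.open(real, "wb") as handle:
        handle.write(b"@read1\nACGT\n+\nIIII\n")
    print(link_for_fastqc(real, "sample.fastq", workdir))
```

Checking the magic bytes rather than the original file name means the link is named correctly even when a linked library dataset has lost its .gz in Galaxy.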
Re: [galaxy-dev] Partial automation for generating those twisty R dependency tool shed installation sequences
Hi Björn, I'm a bit old fashioned and think I prefer a proper Galaxy tool rather than a notebook :) so I've set up a temporary demonstration/test site of a toolfactory generated tool that does what I think I need - can some kind soul please test it and let me know how it goes? If it's useful, it needs to be adjusted to depend on whatever version of package_R you want to work with - currently it just uses the system R for demonstration purposes. I used the toolfactory2 (main toolshed) (which now allows any number of (optionally non editable) parameters!!!) to wrap the script shown at the bottom of https://wiki.galaxyproject.org/SetUpREnvironment. There are currently three parameters - the names of the R/BioC packages from sessionInfo(), the local directory where all the tarballs should be stowed and the XML output prefix to prepend to each row of the generated XML stanza for tool_dependencies.xml. The resulting toolshed tarball was uploaded to a local toolshed and then installed to produce a new tool in the tool generators section - r_bioc_depgen ("Generate dependencies for R/BioC packages"). If you import the history at http://130.56.252.21/history/list_published you will see the toolfactory job (#1,#2,#3) - rerunning will show how the parameters are defined - fugly but it does work. After generating/uploading/installing the new tool, outputs from a test run are in #4 and #5 for DESeq. Comments and suggestions welcome!
On Sun, Jan 11, 2015 at 10:41 PM, Björn Grüning bjoern.gruen...@gmail.com wrote: Hi Ross, you are absolutely right. My download_store repository is exactly for this purpose. https://github.com/bgruening/download_store If you are interested we could integrate your additional magic into the notebook. Thanks, Bjoern
On 11.01.2015 at 01:33, Ross wrote: Hi, Björn, Looks pretty similar! Aren't the links your notebook generates transient?
I think if you put them into a tool_dependencies.xml, they will fail permanently immediately after any of the package authors update one of the relevant svn repositories? AFAIK, it looks like the whole BioC/CRAN infrastructure is automated so a link that works today like http://cran.fhcrc.org/src/contrib/Rcpp_0.11.3.tar.gz will fail when Rcpp next gets updated and Rcpp_0.11.3.tar.gz is migrated to http://cran.fhcrc.org/src/contrib/00Archive/Rcpp/ with a replacement (eg) http://cran.fhcrc.org/src/contrib/Rcpp_0.11.4.tar.gz appearing in the contrib directory? That's why my more complex script downloads all the latest archives into my local github archive repo and generates a permanent link to suit that github repo. We definitely need an automated solution as this is a really infuriating aspect of trying to make code relying on R/BioC packages reproducible.
On Thu, Jan 8, 2015 at 11:28 PM, Björn Grüning bjoern.gruen...@gmail.com wrote: Hi Ross, this is great! Have you seen this notebook? http://nbviewer.ipython.org/github/bgruening/notebooks/blob/master/R/extract_all_dependencies_from_an_r_package.ipynb It tries to do the same thing. Maybe it's also worth mentioning? Maybe we can enhance it? Thanks, Bjoern
On 08.01.2015 at 08:09, Ross wrote: This may be helpful for anyone else struggling to get complex nested R package dependency installation from the tool shed sorted out. That whole can of worms. While we have setup_r_packages, the developer still has to figure out the right magical incantation and make sure the tarballs are available. https://wiki.galaxyproject.org/SetUpREnvironment has some notes I've started - contributions welcome. It has a more or less reusable R script to generate tool_dependencies.xml boilerplate, assuming you set the constant libdir to your local git repository path where those tarballs will be downloaded from. I hope this helps someone! Could make a tool to do this if enough developers want access to it without the pain of managing yet another R script?
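To make the link-rot problem concrete, here is a small Python sketch (not the R script from the wiki) of the two URL shapes described above, plus rows pointing at a stable mirror. The <package> wrapper and the example.org base URL are assumptions for illustration; the wrapper corresponds to the configurable "XML output prefix" mentioned in the thread, and all function names are hypothetical.

```python
from xml.sax.saxutils import escape

CRAN_MIRROR = "http://cran.fhcrc.org/src/contrib"  # mirror used in the thread


def tarball_name(package, version):
    """Source tarball file name, e.g. Rcpp_0.11.3.tar.gz."""
    return "{}_{}.tar.gz".format(package, version)


def current_url(package, version, mirror=CRAN_MIRROR):
    """URL that only works while this is the newest released version."""
    return "{}/{}".format(mirror, tarball_name(package, version))


def archive_url(package, version, mirror=CRAN_MIRROR):
    """URL the same tarball moves to once a newer version is released."""
    return "{}/00Archive/{}/{}".format(mirror, package, tarball_name(package, version))


def dependency_rows(packages, base_url, prefix="<package>", suffix="</package>"):
    """One row per (name, version) pair, pointing at a stable copy.

    base_url is wherever you keep your own copies of the tarballs;
    prefix/suffix are assumptions about the stanza you are generating.
    """
    rows = []
    for package, version in packages:
        url = "{}/{}".format(base_url.rstrip("/"), tarball_name(package, version))
        rows.append("{}{}{}".format(prefix, escape(url), suffix))
    return rows


if __name__ == "__main__":
    print(current_url("Rcpp", "0.11.3"))
    print(archive_url("Rcpp", "0.11.3"))
    for row in dependency_rows([("Rcpp", "0.11.3")], "https://example.org/r-tarballs"):
        print(row)
```

Pointing the generated rows at a mirror you control sidesteps the move from contrib/ to 00Archive/ entirely, which is the design choice described above.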
Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped
Hi Ryan, That is the workaround I am using, which means keeping an uncompressed copy of the FASTQ file on our main storage from where Galaxy can see it (for people to use within their histories). From a long term storage perspective this is not ideal - so I am keen for better handling of gzipped files within Galaxy (particularly within libraries, which we use for raw data). Peter
On Mon, Jan 12, 2015 at 5:20 PM, Ryan G ngsbioinformat...@gmail.com wrote: Yes, I'm doing a link to file on file system when doing a library import. Does this mean I should link to the uncompressed file?
Re: [galaxy-dev] Tool Development DELLY
Hi Marco, No problem - I originally copied the metadata access trick from one of the Galaxy dev-team's tools anyway. Maybe we need to add this to the wiki... Peter
On Tuesday, January 13, 2015, Marco Albuquerque marcoalbuquerque@gmail.com wrote: Hi Peter, I was unaware of how to access metadata; that seemed to be my issue. The tool works now though! Thanks so much, Marco
On 2015-01-09 7:13 PM, Peter Cock p.j.a.c...@googlemail.com wrote: I think the symlink approach is best, see for example the Python wrapper script I used here for samtools idxstats, https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_idxstats However, you can make the link in the XML directly, see Dave's reworking of this wrapper: https://github.com/galaxyproject/tools-devteam/tree/master/tool_collections/samtools/samtools_idxstats Regards, Peter
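A minimal sketch of the symlink approach, assuming the metadata in question is a BAM index (as the linked samtools idxstats wrappers suggest) and that the tool expects the index next to the BAM as input.bam.bai. link_bam_with_index and the input.bam name are hypothetical; see the linked wrappers for the real code.

```python
import os
import tempfile


def link_bam_with_index(bam_path, bai_path, workdir):
    """Symlink a BAM and its index side by side under matching names.

    Galaxy keeps the data file and the index metadata in unrelated
    locations; many tools expect <name>.bam with <name>.bam.bai beside it.
    """
    bam_link = os.path.join(workdir, "input.bam")
    bai_link = bam_link + ".bai"
    os.symlink(os.path.abspath(bam_path), bam_link)
    os.symlink(os.path.abspath(bai_path), bai_link)
    return bam_link


if __name__ == "__main__":
    # Empty stand-in files; a real wrapper would receive both paths
    # from Galaxy on its command line.
    store = tempfile.mkdtemp()
    bam = os.path.join(store, "dataset_1.dat")
    bai = os.path.join(store, "metadata_1.dat")
    for path in (bam, bai):
        open(path, "wb").close()
    workdir = tempfile.mkdtemp()
    print(link_bam_with_index(bam, bai, workdir))
```

Links avoid copying large BAM files, and the temporary directory can be removed after the tool has run.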
[galaxy-dev] Display of file content in a combobox
Hello, I'd like to display the unique values of a column of a file in a combobox, to give the user the possibility to select factor levels for subsequent analyses, e.g. for combining related factor levels in a meta-analysis. Unfortunately, I failed to connect the tool that reads out the factor levels with the tool that receives the combobox selection as an input argument. I'd be very thankful for any suggestions! Christof
Re: [galaxy-dev] Galaxy Bioblend option for importing dataset into a library? (Nicola Soranzo)
Hi Damion, I finally got to implement this, see method copy_from_dataset() in this commit: https://github.com/afgane/bioblend/commit/bc6b7cb71abb25aa109b85b1ff24e73aadac5ce4 Thanks Nicola! Damion Hsiao lab, BC Public Health Microbiology Reference Laboratory, BC Centre for Disease Control 655 West 12th Avenue, Vancouver, British Columbia, V5Z 4R4 Canada
Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped
Galaxy is not decompressing the file. The file is linked to on the filesystem.
On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock p.j.a.c...@googlemail.com wrote: Hi Ryan, The problem isn't Galaxy stripping the extension; rather, Galaxy is actually decompressing the file as part of the upload process. Unfortunately (and there is an open Trello enhancement request on this), Galaxy does not support storing any of the defined datatypes in compressed form UNLESS they are defined that way (like BAM files). This has led some Galaxy admins to define a new datatype lgzippedfastq (or similar - I'd have to check my old emails for the exact name used as a gzipped alternative to the Galaxy sangerfastq datatype) and then modify many/all of their tools to handle this. That is a lot of work, but does offer big disk savings for this key datatype. The Galaxy team instead use a compressed file system, so for usegalaxy.org ALL their data files are compressed but Galaxy can ignore this complexity. Peter
Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped
Ah. Then this is more subtle... are you using the library import option where Galaxy just symlinks to existing files? I thought that was not possible with gzipped files (for the reasons given below). Perhaps this is not being blocked, leading to the confused state you're seeing? Peter
On Mon, Jan 12, 2015 at 4:52 PM, Ryan G ngsbioinformat...@gmail.com wrote: Galaxy is not decompressing the file. The file is linked to on the filesystem.
On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock p.j.a.c...@googlemail.com wrote: Hi Ryan, The problem isn't Galaxy stripping the extension; rather, Galaxy is actually decompressing the file as part of the upload process. Unfortunately (and there is an open Trello enhancement request on this), Galaxy does not support storing any of the defined datatypes in compressed form UNLESS they are defined that way (like BAM files). This has led some Galaxy admins to define a new datatype lgzippedfastq (or similar - I'd have to check my old emails for the exact name used as a gzipped alternative to the Galaxy sangerfastq datatype) and then modify many/all of their tools to handle this. That is a lot of work, but does offer big disk savings for this key datatype. The Galaxy team instead use a compressed file system, so for usegalaxy.org ALL their data files are compressed but Galaxy can ignore this complexity. Peter
Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped
Yes, I'm doing a link to file on file system when doing a library import. Does this mean I should link to the uncompressed file?
On Mon, Jan 12, 2015 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com wrote: Ah. Then this is more subtle... are you using the library import option where Galaxy just symlinks to existing files? I thought that was not possible with gzipped files (for the reasons given below). Perhaps this is not being blocked, leading to the confused state you're seeing? Peter