[galaxy-dev] FastQC wrapper not seeing files at gzipped

2015-01-12 Thread Ryan G
Hi all - I've got a bunch of fastq files uploaded into a data library in
Galaxy.  The underlying files are gzipped; however, Galaxy strips the .gz
from the filename and displays it as .fastq.  When the Python wrapper
rgFastQC.py gets called, it correctly sees the fastq.gz file.  The wrapper
creates a symbolic link to the .gz file in a tmp directory.  The link is
named .fastq.  When FastQC tries to read this file, it fails because it's
compressed.  So one of two things is going wrong here:

1)  It looks like the wrapper is incorrectly renaming the file, but it's
using the name given to it in Galaxy.

2)  When the file is uploaded into the data library, Galaxy is stripping
off the .gz extension.

I think #2 is the more likely problem.  How can I keep Galaxy from
stripping the .gz extension?
___
Please keep all replies on the list by using reply all
in your mail client.  To manage your subscriptions to this
and other Galaxy lists, please use the interface at:
  https://lists.galaxyproject.org/

To search Galaxy mailing lists use the unified search at:
  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-dev] No API way to delete a galaxy data library folder?

2015-01-12 Thread Dooley, Damion
Hi Martin - Following up on this - I think this is your API folder delete commit:
https://bitbucket.org/galaxy/galaxy-central/commits/8f76a6abc5d7d5c98b6c148c4cfe75cc1c159e90
I was wondering how to find out more about this API call.  Not knowing the 
guts of the Galaxy API code well, is it a call like:

http://[my galaxy]/api/folders/[my folder id]/delete

I haven't tested it since I haven't played with the next-stable Galaxy branch.  
Roughly when does that get woven into stable or default?

Regards,

Damion

 On Dec 04, 2014, 4:40pm, Martin Čech wrote:
 I have actually implemented this feature and it will be in the next release 
 (which should be made public around next Monday).
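
For anyone wanting to try this, here is a minimal sketch of what such a call might look like. It assumes the folder is addressed as /api/folders/<folder id> and removed with the HTTP DELETE verb, with the API key passed as a "key" query parameter. None of that is confirmed in this thread, and the URL, folder id and key below are placeholders. The function only builds the request and does not send it.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_folder_delete_request(galaxy_url, folder_id, api_key):
    """Build (but do not send) a DELETE request for a library folder.

    Assumption: the endpoint is <galaxy>/api/folders/<folder id> and the
    API key travels as a "key" query parameter. Check the commit or the
    API docs of your Galaxy release before relying on this.
    """
    base = galaxy_url.rstrip("/")
    query = urlencode({"key": api_key})
    url = "{}/api/folders/{}?{}".format(base, quote(folder_id, safe=""), query)
    return Request(url, method="DELETE")


# Placeholder values, for illustration only.
req = build_folder_delete_request(
    "http://localhost:8080/", "F0123456789abcdef", "my-api-key"
)
print(req.get_method(), req.full_url)
```

Passing the returned object to urllib.request.urlopen() would send it. It seems wise to try that against a test folder first.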

Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped

2015-01-12 Thread Ryan G
To (I think) fix this, I changed line 50 in rgFastQC.py from
infname = self.opts.inputfilename

to
infname = self.opts.input

This will force FastQC to look at the real file and not the renamed
dataset.
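
An alternative that keeps the friendly dataset name would be to keep the symlink but give it a .gz suffix whenever the underlying file is really gzipped. Below is a minimal sketch of that idea. It is not the actual rgFastQC.py code, the function names are made up for illustration, and it assumes FastQC picks its reader from the file name, which is what the failure described above suggests.

```python
import os

# The first two bytes of every gzip stream.
GZIP_MAGIC = b"\x1f\x8b"


def is_gzipped(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as handle:
        return handle.read(2) == GZIP_MAGIC


def link_with_matching_suffix(real_path, display_name, workdir):
    """Symlink real_path into workdir under the Galaxy display name.

    If the data is gzipped but the display name lacks ".gz", the suffix
    is added so the link name matches the content. (Hypothetical helper,
    not part of the FastQC wrapper.)
    """
    name = os.path.basename(display_name)
    if is_gzipped(real_path) and not name.endswith(".gz"):
        name += ".gz"
    link_path = os.path.join(workdir, name)
    os.symlink(os.path.abspath(real_path), link_path)
    return link_path
```

Called with a gzipped dataset file and the display name "sample.fastq", this would create a link named "sample.fastq.gz" in the work directory. An uncompressed file keeps its name unchanged.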


On Mon, Jan 12, 2015 at 12:20 PM, Ryan G ngsbioinformat...@gmail.com
wrote:

 Yes, I'm doing a link to file on file system when doing a library import.
 Does this mean I should link to the uncompressed file?

 On Mon, Jan 12, 2015 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com
 wrote:

 Ah. Then this is more subtle... are you using the
 library import option where Galaxy just symlinks
 to existing files? I thought that was not possible
 with gzipped files (for the reasons given below).
 Perhaps this is not being blocked, leading to the
 confused state you're seeing?

 Peter

 On Mon, Jan 12, 2015 at 4:52 PM, Ryan G ngsbioinformat...@gmail.com
 wrote:
  Galaxy is not decompressing the file.  The file is linked to on the
  filesystem.
 
  On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock p.j.a.c...@googlemail.com
  wrote:
 
  Hi Ryan,
 
  The problem isn't Galaxy stripping the extension, rather
  Galaxy is actually decompressing the file as part of the
  upload process.
 
  Unfortunately (and there is an open Trello enhancement
  request on this), Galaxy does not support storing any of
  the defined datatypes in compressed form UNLESS they
  are defined that way (like BAM files).
 
  This has led some Galaxy Admins to define a new datatype
  lgzippedfastq (or similar - I'd have to check my old emails
  for the exact name used as a gzipped alternative to the
  Galaxy sangerfastq datatype) and then modify many/all
  their tools to handle this. That is a lot of work, but does
  offer big disk savings for this key datatype.
 
  The Galaxy team instead use a compressed file system,
  so for usegalaxy.org ALL their data files are compressed
  but Galaxy can ignore this complexity.
 
  Peter
 

Re: [galaxy-dev] Partial automation for generating those twisty R dependency tool shed installation sequences

2015-01-12 Thread Ross
Hi Björn,
I'm a bit old fashioned and think I prefer a proper Galaxy tool rather than
a notebook :) so I've set up a temporary demonstration/test site of a
toolfactory generated tool that does what I think I need - can some kind
soul please test it and let me know how it goes? If it's useful, it needs
to be adjusted to depend on whatever version of package_R you want to work
with - it currently just uses the system R for demonstration purposes.

I used the toolfactory2 (main toolshed) (which now allows any number of
(optionally non editable) parameters!!!) to wrap the script shown at the
bottom of https://wiki.galaxyproject.org/SetUpREnvironment. There are
currently three parameters - the names of the R/BioC packages from
sessionInfo(), the local directory where all the tarballs should be stowed
and the XML output prefix to prepend to each row of the generated XML
stanza for tool_dependencies.xml
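
To give a rough idea of the output, here is a minimal Python sketch of the row-generation step (the wrapped script itself is in R). It assumes each tarball becomes one <package> row and that the prefix is the base URL where the tarballs are hosted. Both are assumptions made for illustration, and the function name and example URL are made up.

```python
from xml.sax.saxutils import escape


def dependency_rows(tarballs, url_prefix, indent="    "):
    """Return one <package> row per tarball for tool_dependencies.xml.

    Assumption: every row is simply <package>PREFIX/TARBALL</package>.
    The real generated stanza may carry more structure than this.
    """
    base = url_prefix.rstrip("/")
    rows = []
    for tarball in tarballs:
        url = "{}/{}".format(base, tarball)
        rows.append("{}<package>{}</package>".format(indent, escape(url)))
    return "\n".join(rows)


# Hypothetical hosting location; tarball names as listed by sessionInfo().
rows = dependency_rows(
    ["Rcpp_0.11.3.tar.gz", "DESeq_1.18.0.tar.gz"],
    "https://example.org/r-tarballs/",
)
print(rows)
```

The rows keep the order of the input list. That matters if the packages must be installed in dependency order.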

The resulting toolshed tarball was uploaded to a local toolshed and then
installed to produce a new tool in the tool generators section
- r_bioc_depgen Generate dependencies for R/BioC packages

If you import the history at http://130.56.252.21/history/list_published
you will see the toolfactory job (#1,#2,#3) - rerunning will show how the
parameters are defined - fugly but it does work.
After generating/uploading/installing the new tool, outputs from a test run
are in #4 and #5 for DESeq

Comments and suggestions welcomed!

On Sun, Jan 11, 2015 at 10:41 PM, Björn Grüning bjoern.gruen...@gmail.com
wrote:

 Hi Ross,

 you are absolutely right.
 My download_store repository is exactly for this purpose.

 https://github.com/bgruening/download_store

 If you are interested we could integrate your additional magic into the
 notebook.

 Thanks,
 Bjoern

 Am 11.01.2015 um 01:33 schrieb Ross:
  Hi, Björn,
  Looks pretty similar!
  Aren't the links your notebook generates transient? I think if you put
  them into a tool_dependencies.xml, they will fail permanently
  immediately after any of the package authors updates one of the
  relevant svn repositories?
 
  AFAIK, it looks like the whole BioC/CRAN infrastructure is automated so a
  link that works today like
  http://cran.fhcrc.org/src/contrib/Rcpp_0.11.3.tar.gz will fail when Rcpp
  next gets updated and Rcpp_0.11.3.tar.gz is migrated to
  http://cran.fhcrc.org/src/contrib/00Archive/Rcpp/ with a replacement
  (eg) http://cran.fhcrc.org/src/contrib/Rcpp_0.11.4.tar.gz appearing in
  the contrib directory?
 
  That's why my more complex script downloads all the latest archives into
  my local github archive repo and generates a permanent link to suit that
  github repo.
  We definitely need an automated solution as this is a really infuriating
  aspect of trying to make code relying on R/BioC packages reproducible.
 
 
  On Thu, Jan 8, 2015 at 11:28 PM, Björn Grüning bjoern.gruen...@gmail.com
  wrote:
 
  Hi Ross,
 
  this is great!
  Have you seen this notebook?
 
 
 
 http://nbviewer.ipython.org/github/bgruening/notebooks/blob/master/R/extract_all_dependencies_from_an_r_package.ipynb
 
  It tries to do the same thing. Maybe it's also worth mentioning? Maybe
  we can enhance it?
 
  Thanks,
  Bjoern
 
  Am 08.01.2015 um 08:09 schrieb Ross:
  This may be helpful for anyone else struggling to get complex nested R
  package dependency installation from the tool shed sorted out. That
  whole can of worms. While we have setup_r_packages, the developer still
  has to figure out the right magical incantation and make sure the
  tarballs are available.
 
  https://wiki.galaxyproject.org/SetUpREnvironment has some notes I've
  started - contributions welcome.
 
  It has a more or less reusable R script to generate
  tool_dependencies.xml boilerplate, assuming you set the constant libdir
  to your local git repository path where those tarballs will be
  downloaded from.
 
  I hope this helps someone!
 
  Could make a tool to do this if enough developers want access to it
  without the pain of managing yet another R script?
 
 
 


Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped

2015-01-12 Thread Peter Cock
Hi Ryan,

That is the workaround I am using, which means
keeping an uncompressed copy of the FASTQ
file on our main storage from where Galaxy can
see it (for people to use within their histories).

From a long term storage perspective this is not
ideal - so I am keen for better handling of gzipped
files within Galaxy (particularly within libraries
which we use for raw data).

Peter

On Mon, Jan 12, 2015 at 5:20 PM, Ryan G ngsbioinformat...@gmail.com wrote:
 Yes, I'm doing a link to file on file system when doing a library import.
 Does this mean I should link to the uncompressed file?

 On Mon, Jan 12, 2015 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com
 wrote:

 Ah. Then this is more subtle... are you using the
 library import option where Galaxy just symlinks
 to existing files? I thought that was not possible
 with gzipped files (for the reasons given below).
 Perhaps this is not being blocked, leading to the
 confused state you're seeing?

 Peter

 On Mon, Jan 12, 2015 at 4:52 PM, Ryan G ngsbioinformat...@gmail.com
 wrote:
  Galaxy is not decompressing the file.  The file is linked to on the
  filesystem.
 

Re: [galaxy-dev] Tool Development DELLY

2015-01-12 Thread Peter Cock
Hi Marco,

No problem - I originally copied the metadata access
trick from one of the Galaxy dev-team's tools anyway.

Maybe we need to add this to the wiki...

Peter

On Tuesday, January 13, 2015, Marco Albuquerque 
marcoalbuquerque@gmail.com wrote:

 Hi Peter,

 I was unaware of how to access metadata, that seemed to be my issue. The
 tool works now though!

 Thanks so much,

 Marco




 On 2015-01-09 7:13 PM, Peter Cock p.j.a.c...@googlemail.com wrote:
 
 I think the symlink approach is best, see for example the Python
 wrapper script I used here for samtools idxstats,
 
 https://github.com/peterjc/pico_galaxy/tree/master/tools/samtools_idxstats
 
 However, you can make the link in the XML directly, see Dave's
 reworking of this wrapper:
 
 https://github.com/galaxyproject/tools-devteam/tree/master/tool_collections/samtools/samtools_idxstats
 
 Regards,
 
 Peter
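
 For readers who do not want to open the links, the symlink approach can be sketched roughly as below. This is not the code from either wrapper. It assumes the usual samtools situation: Galaxy keeps the BAM index as a separate metadata file, while the tool expects it next to the BAM as <name>.bam.bai. The function name and file names are made up for illustration.

```python
import os
import tempfile


def link_bam_with_index(bam_path, bai_path, workdir=None):
    """Symlink a BAM file and its index side by side in a work directory.

    Assumption: the tool looks for the index at "<bam>.bai" beside the
    BAM file. Returns the path of the BAM symlink to pass to the tool.
    """
    workdir = workdir or tempfile.mkdtemp()
    bam_link = os.path.join(workdir, "input.bam")
    os.symlink(os.path.abspath(bam_path), bam_link)
    os.symlink(os.path.abspath(bai_path), bam_link + ".bai")
    return bam_link
```

 The wrapper would then run the tool on the returned path and remove the work directory afterwards. The index path is what the metadata access trick mentioned above supplies.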




[galaxy-dev] Display of file content in a combobox

2015-01-12 Thread christof.piet...@kws.com
Hello,
I'd like to display the unique column content of a file in a combobox to give 
the user the possibility to select factor levels for subsequent analyses, e.g. 
for combining related factor levels in a meta-analysis.
Unfortunately, I failed to connect the tool that reads out the factor levels 
with the tool that receives the selection via the combobox as an input argument.
I'd be very thankful for any suggestions!

Christof

Re: [galaxy-dev] Galaxy Bioblend option for importing dataset into a library? (Nicola Soranzo)

2015-01-12 Thread Dooley, Damion
 Hi Damion,

 I finally got to implement this, see method copy_from_dataset() in this
commit:

 https://github.com/afgane/bioblend/commit/bc6b7cb71abb25aa109b85b1ff24e73aadac5ce4

Thanks Nicola!

Damion

Hsiao lab, BC Public Health Microbiology & Reference Laboratory, BC Centre for 
Disease Control
655 West 12th Avenue, Vancouver, British Columbia, V5Z 4R4 Canada

Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped

2015-01-12 Thread Ryan G
Galaxy is not decompressing the file.  The file is linked to on the
filesystem.

On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock p.j.a.c...@googlemail.com
wrote:

 Hi Ryan,

 The problem isn't Galaxy stripping the extension, rather
 Galaxy is actually decompressing the file as part of the
 upload process.

 Unfortunately (and there is an open Trello enhancement
 request on this), Galaxy does not support storing any of
 the defined datatypes in compressed form UNLESS they
 are defined that way (like BAM files).

 This has led some Galaxy Admins to define a new datatype
 lgzippedfastq (or similar - I'd have to check my old emails
 for the exact name used as a gzipped alternative to the
 Galaxy sangerfastq datatype) and then modify many/all
 their tools to handle this. That is a lot of work, but does
 offer big disk savings for this key datatype.

 The Galaxy team instead use a compressed file system,
 so for usegalaxy.org ALL their data files are compressed
 but Galaxy can ignore this complexity.

 Peter


Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped

2015-01-12 Thread Peter Cock
Ah. Then this is more subtle... are you using the
library import option where Galaxy just symlinks
to existing files? I thought that was not possible
with gzipped files (for the reasons given below).
Perhaps this is not being blocked, leading to the
confused state you're seeing?

Peter

On Mon, Jan 12, 2015 at 4:52 PM, Ryan G ngsbioinformat...@gmail.com wrote:
 Galaxy is not decompressing the file.  The file is linked to on the
 filesystem.

 On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock p.j.a.c...@googlemail.com
 wrote:

 Hi Ryan,

 The problem isn't Galaxy stripping the extension, rather
 Galaxy is actually decompressing the file as part of the
 upload process.

 Unfortunately (and there is an open Trello enhancement
 request on this), Galaxy does not support storing any of
 the defined datatypes in compressed form UNLESS they
 are defined that way (like BAM files).

 This has led some Galaxy Admins to define a new datatype
 lgzippedfastq (or similar - I'd have to check my old emails
 for the exact name used as a gzipped alternative to the
 Galaxy sangerfastq datatype) and then modify many/all
 their tools to handle this. That is a lot of work, but does
 offer big disk savings for this key datatype.

 The Galaxy team instead use a compressed file system,
 so for usegalaxy.org ALL their data files are compressed
 but Galaxy can ignore this complexity.

 Peter


Re: [galaxy-dev] FastQC wrapper not seeing files at gzipped

2015-01-12 Thread Ryan G
Yes, I'm doing a link to file on file system when doing a library import.
Does this mean I should link to the uncompressed file?

On Mon, Jan 12, 2015 at 12:14 PM, Peter Cock p.j.a.c...@googlemail.com
wrote:

 Ah. Then this is more subtle... are you using the
 library import option where Galaxy just symlinks
 to existing files? I thought that was not possible
 with gzipped files (for the reasons given below).
 Perhaps this is not being blocked, leading to the
 confused state you're seeing?

 Peter

 On Mon, Jan 12, 2015 at 4:52 PM, Ryan G ngsbioinformat...@gmail.com
 wrote:
  Galaxy is not decompressing the file.  The file is linked to on the
  filesystem.
 
  On Mon, Jan 12, 2015 at 10:28 AM, Peter Cock p.j.a.c...@googlemail.com
  wrote:
 
  Hi Ryan,
 
  The problem isn't Galaxy stripping the extension, rather
  Galaxy is actually decompressing the file as part of the
  upload process.
 
  Unfortunately (and there is an open Trello enhancement
  request on this), Galaxy does not support storing any of
  the defined datatypes in compressed form UNLESS they
  are defined that way (like BAM files).
 
  This has led some Galaxy Admins to define a new datatype
  lgzippedfastq (or similar - I'd have to check my old emails
  for the exact name used as a gzipped alternative to the
  Galaxy sangerfastq datatype) and then modify many/all
  their tools to handle this. That is a lot of work, but does
  offer big disk savings for this key datatype.
 
  The Galaxy team instead use a compressed file system,
  so for usegalaxy.org ALL their data files are compressed
  but Galaxy can ignore this complexity.
 
  Peter
 