Re: [galaxy-user] Problem with executable file in FastQC

2013-12-09 Thread Ross
Hi, Jorge,
Galaxy source includes the Galaxy interfaces but not the third party
executables for tools like fastqc or bwa. They can be automagically
installed if you install the tool from a tool shed but at a guess, you are
working on your desktop with the fastqc tool in a recent clone of galaxy?
Unfortunately, the tool can't run until you have the fastqc software
working in a particular way hinted at in the guide for the fastqc tool on
the page at http://wiki.galaxyproject.org/Admin/Tools/Tool%20Dependencies -
try that please?
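
A generic first check, before working through the wiki page, is whether the
third party executable can be found at all by the account that runs Galaxy.
The sketch below is not the procedure the wiki describes, only a possible
starting point: it uses Python's shutil.which to look a program name up on
the PATH. The name "fastqc" is an assumption and may differ on your system.

```python
import shutil


def find_executable(name):
    """Return the full path of `name` if it is on PATH, else None."""
    return shutil.which(name)


# "fastqc" is an assumed executable name; adjust it to your installation.
path = find_executable("fastqc")
if path is None:
    print("fastqc not found on PATH")
else:
    print("fastqc found at", path)
```

Run it as the same user that starts Galaxy, since that user's PATH is the one
that matters.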

There are things to do to make your instance more reliable and stable too -
eg http://wiki.galaxyproject.org/Admin/Config/Performance/ProductionServer




On Mon, Dec 9, 2013 at 9:48 PM, Jorge Braun braun_...@hotmail.com wrote:

 Hi mates!!

 I have a problem with Galaxy... FastQC doesn't run because the file
 FastQC.py can't find the executable file FastQC.xml, although that file
 is in the same directory, called rgenetics (as seen in a Linux terminal).

 Can someone help me?

 Thanks and have a nice day :)



___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/

To search Galaxy mailing lists use the unified search at:

  http://galaxyproject.org/search/mailinglists/

Re: [galaxy-user] Counts of mapped reads for each gene?

2013-08-22 Thread Ross
Hi, Yan
The htseq_bams_to_count_matrix tool in the test toolshed might be worth a
try - it creates tabular count matrices from any number of individual
sample bam/sam files (it is NOT read group aware!). Each row contains the
count for that contig for each sample. It uses HTSeq code and you supply
your favourite gene model as a GTF file for defining the regions to count
and how to amalgamate - eg count reads overlapping exons and sum those into
total counts for each gene. Please give it a try. Install from the admin
interface and let me know how you get on. There's a companion tool,
differential_count_models, also in the test toolshed that includes edgeR,
DESeq2 and VOOM from Bioconductor - it runs 1 or 2 way GLMs using the count
matrices generated by the htseq tool. Be warned that it compiles and
installs R 3.0.1 and Bioconductor packages, so be patient and allow 20
minutes or so for the installation to finish.
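
To make the "sum exon counts into gene totals" step concrete, here is a
minimal sketch. It is not the HTSeq code the tool uses and it does not read
BAM or GTF files; it assumes per-exon counts have already been produced, and
all sample, exon and gene names are made up for illustration.

```python
from collections import defaultdict


def gene_count_matrix(exon_counts, exon_to_gene):
    """Sum per-exon read counts into per-gene totals for each sample.

    exon_counts:  {sample: {exon_id: count}}
    exon_to_gene: {exon_id: gene_id}
    Returns (samples, rows) with rows as {gene_id: [count per sample]}.
    """
    samples = sorted(exon_counts)
    rows = defaultdict(lambda: [0] * len(samples))
    for col, sample in enumerate(samples):
        for exon_id, count in exon_counts[sample].items():
            gene_id = exon_to_gene.get(exon_id)
            if gene_id is None:
                continue  # exon not in the gene model: not counted
            rows[gene_id][col] += count
    return samples, dict(rows)


# Hypothetical toy data: two samples, three exons, two genes.
exon_counts = {
    "sampleA": {"e1": 10, "e2": 5, "e3": 7},
    "sampleB": {"e1": 3, "e2": 0, "e3": 12},
}
exon_to_gene = {"e1": "geneX", "e2": "geneX", "e3": "geneY"}

samples, matrix = gene_count_matrix(exon_counts, exon_to_gene)
print("gene\t" + "\t".join(samples))
for gene in sorted(matrix):
    print(gene + "\t" + "\t".join(str(c) for c in matrix[gene]))
```

Each output row is one gene and each column one sample, which is the shape
of raw count matrix that count based tools such as edgeR expect.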

Suggestions for improvement or bug reports always welcomed. Good luck.


On Thu, Aug 22, 2013 at 3:35 PM, Yan He yanh...@hotmail.com wrote:

 Hi Jen and other galaxy-users,

 I am analyzing our RNA-seq data. First, I mapped the RNA-seq data to the
 reference genome. I am wondering if there is a tool that could count the
 number of reads that mapped to each gene. That’s important information for
 my subsequent analysis. Any reply is highly appreciated! Thanks,

 Yan


Re: [galaxy-user] Rename tool output file in Galaxy

2013-07-24 Thread Ross
Sachit, that's a feature, not a bug.
Changing the name of a file on disk being managed by Galaxy is unlikely to
have a happy ending.
The pencil icon allows changes to the history display name but the disk
file name needs to be left alone.



On Wed, Jul 24, 2013 at 8:15 PM, Sachit Adhikari 
sachit.techner...@gmail.com wrote:

 Yes, but unfortunately that didn't work either. I have already tried the
 solution you provided. I could edit the tags and annotations but it won't
 change the file name in my file system. The file is still named
 dataset_43649.dat.


 On Wed, Jul 24, 2013 at 3:52 PM, Peter Cock p.j.a.c...@googlemail.com wrote:

 On Wed, Jul 24, 2013 at 10:35 AM, Sachit Adhikari
 sachit.techner...@gmail.com wrote:
  Hi,
 
  By "i" do you mean the Edit attributes button? There are four sub-headings
  on that: Attributes, Convert format, Datatype and permissions, but I can't
  see the data file's full path on any of them. What's wrong?

 No, not the edit button.

 First click on a dataset's title so it expands. You should see a snippet
 of data etc plus a row of icons (save, information, reload in bottom
 left of the history, and tags and annotation bottom right).

 Peter







-- 
Ross Lazarus MBBS MPH;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444
http://scholar.google.com/citations?hl=en&user=UCUuEM4J

Re: [galaxy-user] Exceptionally high RPKM values of miRNA and other short genes in Cuffdiff's output

2013-07-18 Thread Ross
Hi, Thanh,
If your primary goal is inference about differential 'gene' expression
taking biological variability into account with biological replicates for
each of two conditions, you might want (eg see Dillies et al.,
http://bib.oxfordjournals.org/content/early/2012/09/15/bib.bbs046.long and
http://wiki.galaxyproject.org/Events/GCC2013/Abstracts#Events.2FGCC2013.2FAbstracts.2FPosters.P4:_Comparing_R-based_methods_and_Cuffdiff2_for_analysis_of_RNA-seq_data_in_Galaxy)
to try (and compare!) edgeR (and optionally DESeq and VOOM/limma). A set of
*very much beta* tools is available for admin installation and user testing
from the test toolshed in the statistics section owned by fubar.

The edgeR tool can optionally run 2 way GLM. It requires raw count matrices
as inputs which can be generated from a GTF/'gene' model of your choice and
any number of mapped SAM/BAM inputs using the htseq based companion tool in
the same tool shed section. Please don't install to a production machine
yet but we're getting good results from it - feedback and code improvements
are welcomed from willing beta testers.

The R 3.0.x tool shed dependency package in particular is still under
development and is likely to change substantially in the next week or two
as we sort out a sane and generalised Atlas dependency installation.


On Fri, Jul 19, 2013 at 2:55 AM, Hoang, Thanh hoan...@miamioh.edu wrote:

 Hi all,
 I have been analyzing my RNA-seq data on mouse tissues. My RNA-seq data is
 single-end and 51 bp in length. I ran TopHat/Cufflinks/Cuffdiff to test for
 differential gene expression.
 In Cuffdiff's output, I got very high RPKM values for some miRNAs and
 some other short genes (less than 100 bp). These genes are among the genes
 with the highest RPKM. I think the RPKM values of these genes are probably
 too high to be true.
 test_id         gene_id         gene     locus                   sample_1    sample_2  status  value_1   value_2  log2(fold_change)  test_stat  p_value  q_value   significant
 ENSMUSG0093077  ENSMUSG0093077  Mir5105  5:146231229-146302874   Epithelium  Fiber     OK      1.53E+06  445558   -1.78097           -355.367   0.00715  0.016986  yes
 ENSMUSG0093098  ENSMUSG0093098  Gm22641  7:130162450-133124354   Epithelium  Fiber     OK      87894.1   36474.7  -1.26887           -0.59863   0.4913   0.587174  no
 ENSMUSG0089855  ENSMUSG0089855  Gm15662  10:105187662-105583874  Epithelium  Fiber     OK      42868.9   21566.5  -0.99114           -20.7066   0.0186   0.039568  yes
 ENSMUSG0092984  ENSMUSG0092984  Mir5115  2:73012853-73012927     Epithelium  Fiber     OK      21104.8   8317.49  -1.34335           -447.314   0.0001   0.000354  yes
 ENSMUSG0086324  ENSMUSG0086324  Gm15564  16:35926510-36037131    Epithelium  Fiber     OK      6443.35   3664.15  -0.81433           -1.52095   0.2129   0.301429  no
 ENSMUSG0092981  ENSMUSG0092981  Mir5125  17:23803186-23824739    Epithelium  Fiber     OK      5974.14   2390.75  -1.32127           -0.34111   0.5746   0.661937  no

 I checked some forums and they said that this is a drawback of
 TopHat/Cufflinks/Cuffdiff when dealing with short genes. But I am still not
 so clear about this. Has anyone had the same problem? What can I do in this
 situation?
 Can anyone suggest any other good tools to test for (1) differential gene
 expression OR (2) both differential gene expression and gene discovery?

 Thank you
 Thanh
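
The length effect Thanh asks about follows from the textbook RPKM
definition: reads are divided by feature length, so the same number of reads
gives a much larger value on a short feature. The sketch below uses that
plain definition with made-up numbers. It is not how Cufflinks computes its
values, which apply additional corrections this ignores.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of feature per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


# Hypothetical numbers: the same 500 reads on a 75 bp and a 3000 bp gene.
total = 20_000_000
short_gene = rpkm(500, 75, total)
long_gene = rpkm(500, 3000, total)
print(round(short_gene, 2), round(long_gene, 2))  # prints 333.33 8.33
```

With identical read counts the 75 bp feature scores 40 times higher, so a
handful of reads on a very short gene can dominate a ranking by RPKM.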


Re: [galaxy-user] Upload files from filesystem paths

2013-01-03 Thread Ross
Hi, Neil,
Thanks - that sounds interesting. Like I said, composite datatypes are
designed to manage collections of related files as a unit and this sounds
like a potential use case. There are lots of tools and lots of code that
can serve as examples but it's definitely not trivial because you will
almost certainly be subclassing the Html data class and writing methods to
manage those related files (ie extending the guts of Galaxy) and your tools
will all need to know how to deal with the managed structure when they get
one as an input.

You may need to find or build up a programmer with some relevant Galaxy
composite datatype experience. There is some documentation but it's not
extensive or transparent.

Good luck.


On Fri, Jan 4, 2013 at 9:19 AM, neil.burd...@csiro.au wrote:

 Hi Ross,

 I don’t know of any tools that work in the way I want, but
 I’m not an expert on tools within Galaxy. Essentially the data in the
 directories will be fixed. We run a tool from Galaxy that generates some
 output data, this data then “checks” the data located under the directories
 I am trying to upload to Galaxy. There will probably be around 20
 directories, and the data produced would then search these directories
 looking for “a closest match”. Once located, it would use the remaining
 files in the directory to complete the process.

 So for example, the application is segmenting an image, so a part of the
 image is the output. This is compared with files in the uploaded
 directories and a file in a particular directory is chosen (as the closest
 match) then the remaining files in the directory are then used to complete
 the process.

 Does that make sense? There would be around 20 files in each directory.

 Thanks

 Neil



Re: [galaxy-user] Upload files from filesystem paths

2013-01-03 Thread Ross
For the simplest case, start with the tools/rgenetics/rgFastQC tool - it
doesn't need a subclass but uses the Html datatype files_path as a simple
multiple file bucket.
Once you've got that all figured out, check out the rgenetics datatypes (eg
pbed) subclassed from Html defined in lib/galaxy/datatypes/genetics and the
tools that use it (eg TDT or CaCo tools) in tools/rgenetics for more
complex hackery keeping related files needed by plink together.
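
As a rough illustration of the "multiple file bucket" idea, the sketch below
writes several files into one directory and an HTML index that links to
them. It is not taken from rgFastQC; it assumes the tool XML passes the
script the Html output path and its files_path directory as two arguments,
and the file names and contents are hypothetical.

```python
import html
import os
import tempfile


def write_html_bucket(html_path, files_dir, payloads):
    """Write each payload into files_dir and index them from html_path.

    payloads: {file_name: text}. Links are relative, on the assumption
    that files in files_dir are served alongside the HTML page.
    """
    os.makedirs(files_dir, exist_ok=True)
    links = []
    for name, text in sorted(payloads.items()):
        with open(os.path.join(files_dir, name), "w") as handle:
            handle.write(text)
        links.append('<li><a href="%s">%s</a></li>'
                     % (html.escape(name, quote=True), html.escape(name)))
    with open(html_path, "w") as handle:
        handle.write("<html><body><ul>\n%s\n</ul></body></html>\n"
                     % "\n".join(links))
    return sorted(os.listdir(files_dir))


# Stand-ins for the two paths Galaxy would supply to a real tool.
workdir = tempfile.mkdtemp()
html_path = os.path.join(workdir, "report.html")
files_dir = os.path.join(workdir, "report_files")
written = write_html_bucket(html_path, files_dir,
                            {"summary.txt": "ok\n", "details.txt": "more\n"})
print(written)
```

A downstream tool that receives such a dataset would look in the same
directory for the file names it expects, which is the part a composite
datatype subclass formalises.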


On Fri, Jan 4, 2013 at 9:50 AM, neil.burd...@csiro.au wrote:

 Thanks for the help Ross.

 Any chance you can point me to the examples you mentioned?

 Thanks again

 Neil


Re: [galaxy-user] Upload files from filesystem paths

2013-01-02 Thread Ross
Neil,
It would help if you could point to an existing tool that works the way you
want. I don't know of any that deal with arbitrary nested directories
containing arbitrary files. A new composite datatype could impose a
structure that a tool could be written to deal with (eg the pbed datatype
used in some rgenetics tools) but arbitrary data structures are not going
to be possible AFAIK. You're unlikely to get useful help without a much
more complete and clear explanation of the problem.


On Thu, Jan 3, 2013 at 1:50 PM, neil.burd...@csiro.au wrote:

 Hi Ross,
 I think I need to clarify. I have a file in
 /home/galaxy/data-test/dir1/dir2/somefile.txt

 Under "Upload files from filesystem paths", in the path to upload
 window I paste /home/galaxy/data-test. This then puts somefile.txt
 in the /home/galaxy/galaxy-dist/database/files/000 directory. However, I
 elected to keep the directory structure. I can see this if I navigate
 through the shared data tab, but where is this information stored under
 the galaxy-dist structure? My application needs the directory
 structure to be kept, so I need to access it from the xml/command line.

 I thought it might have been something like:
 /home/galaxy/galaxy-dist/database/files/000/data-test/dir1/dir2/dataset_id.dat.
 But this is not the case; rather it is
 /home/galaxy/galaxy-dist/database/files/000/dataset_id.dat, i.e. no
 directory structure. So how can I access this information from the xml
 files in the tools directory?

 Thanks
 Neil
 




Re: [galaxy-user] Upload files from filesystem paths

2013-01-01 Thread Ross
Try importing those library files to the history where you want them -
browse the Galaxy 'shared data' tab to where you uploaded them.


On Wed, Jan 2, 2013 at 11:39 AM, neil.burd...@csiro.au wrote:

 Hi,
I have a local galaxy installation.

 I've created a data library, selected Upload files from filesystem
 paths, pasted a path in the path to upload window, and I've selected to
 preserve the directory structure. And the files get imported.

 How do I now access these files from my application? I don't want to
 import them into the history as then they lose the directory structure. I
 can't see where they are physically under the galaxy-dist structure

 Thanks for any help

 Neil



Re: [galaxy-user] Identification of replicate outlier

2012-11-08 Thread Ross
Hi Dave,
This is an interesting and non-trivial question that extends well
beyond Galaxy - and there's no simple solution AFAIK
Defining an 'outlier' tends to boil down to subjective judgement in
most real cases I've seen.
EG: see 
http://comments.gmane.org/gmane.science.biology.informatics.conductor/40927

My 2c worth:
a) confirm that all of your sample library sizes and quality score
distributions are comparable with the FastQC tool. A sample with
relatively low library size may indicate an upstream technical failure
with (eg) RNA extraction or a flowcell lane.
b) check that the number of unique alignments to the reference are
similar (eg picard alignment summary metrics or even the samtools
flagstat tool)
c) if you can create an appropriate input matrix (read counts by exon
or other contig for each sample eg), the Principal Component Analysis
tool might be helpful (library size normalization is one devil that
lies in the detail and it's not quite the same as MDS - see below)
d) If you're an R hacker, you might find
http://gettinggeneticsdone.blogspot.com.au/2012/09/deseq-vs-edger-comparison.html
useful - it shows how to get MDS plots which are probably the most
reliable way to identify samples that don't cluster well with the
other members of their tribe
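
To give a feel for points c) and d), here is a minimal sketch of the idea
behind them: normalise counts for library size, then see which sample sits
furthest from the rest. It is neither the Galaxy PCA tool nor an MDS plot,
only the pairwise distance step that such plots build on, and the replicate
names and counts are made up.

```python
import math


def log_cpm(counts_by_sample):
    """Convert raw counts to log2 counts-per-million, per sample."""
    out = {}
    for sample, counts in counts_by_sample.items():
        lib_size = sum(counts)
        out[sample] = [math.log2(c / lib_size * 1e6 + 1.0) for c in counts]
    return out


def mean_distance_to_others(profiles):
    """Mean Euclidean distance from each sample to all other samples."""
    names = sorted(profiles)
    result = {}
    for a in names:
        dists = [math.dist(profiles[a], profiles[b])
                 for b in names if b != a]
        result[a] = sum(dists) / len(dists)
    return result


# Hypothetical counts for four contigs in three replicates of one condition.
counts = {
    "rep1": [100, 200, 50, 300],
    "rep2": [110, 190, 55, 310],
    "rep3": [10, 800, 50, 300],
}
scores = mean_distance_to_others(log_cpm(counts))
for name in sorted(scores):
    print(name, round(scores[name], 2))
print("furthest from the others:", max(scores, key=scores.get))
```

A sample flagged this way is only a candidate; whether it counts as an
outlier is still the subjective judgement described above.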



On Fri, Nov 9, 2012 at 10:22 AM, Dave Corney dcor...@princeton.edu wrote:
 Hello list,

 I've been analyzing an experiment with two groups each with three
 replicates. My workflow was TopHat (paired end) - Cufflinks - CuffDiff.
 Unfortunately, there are not many significant differences identified by
 CuffDiff.

 I am wondering whether one of my replicates might be an outlier. Does
 anybody have a suggestion on how to search for an outlier? The quality
 statistics of the unprocessed data looked equally good for all samples, so I
 don't think that this is a problem.

 Thanks,
 Dave





-- 
Ross Lazarus MBBS MPH;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444
http://scholar.google.com/citations?hl=en&user=UCUuEM4J


Re: [galaxy-user] RSEM on Galaxy

2012-10-05 Thread Ross
Alicia,

Some relevant details usually make it easier for list members to help
you - please see
http://wiki.g2.bx.psu.edu/Support#Public_mailing_list_Q_.26_A_discussions

Did you load the offending tool from a toolshed or is it a tool
wrapper you wrote yourself? If the latter, I have two suggestions:

1) Search paster.log for your tool name and make sure it got correctly
loaded when you restarted Galaxy - xml syntax errors like misplaced 
characters will break the tool loading process.

2) if the log shows that it loaded, make sure you refresh the tool
frame in your browser after restarting Galaxy - eg clicking the
analyse data tab will get rid of the old cached copy.
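
For suggestion 1, a quick way to rule out XML syntax errors before reading
the log is to parse the wrapper yourself. The sketch below uses Python's
standard xml.etree.ElementTree; the tool id and the XML snippets are
hypothetical, and it only checks that the file is well formed with a <tool>
root element, not that Galaxy will accept everything inside it.

```python
import xml.etree.ElementTree as ET


def check_tool_xml(xml_text):
    """Report whether xml_text parses and has a <tool> root element."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as err:
        return "XML syntax error: %s" % err
    if root.tag != "tool":
        return "root element is <%s>, expected <tool>" % root.tag
    return "ok: tool id=%s" % root.get("id")


# Hypothetical wrappers: one well formed, one with a stray '<' character.
good = '<tool id="rsem_example" name="RSEM"><command>echo</command></tool>'
bad = '<tool id="rsem_example"><command>a < b</command></tool>'

good_result = check_tool_xml(good)
bad_result = check_tool_xml(bad)
print(good_result)
print(bad_result)
```

For a real wrapper, read the file contents and pass them in, for example
check_tool_xml(open(path).read()) with path pointing at your tool XML.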


On Sat, Oct 6, 2012 at 10:02 AM, Alicia R. Pérez-Porro
alicia.r.perezpo...@gmail.com wrote:
 Hi all,

 Recently I installed RSEM into Galaxy.
 The problem is that now I cannot find the tool when I open Galaxy on my
 computer. I checked and the rsem folder is inside the tool folder.
 Any suggestions?

 Thanks,
 Alicia.





 --
 Alicia R. Pérez-Porro
 PhD student

 Giribet lab
 Department of Organismic and Evolutionary Biology
 MCZ labs
 Harvard University
 26 Oxford St, Cambridge MA 02138
 phone: +1 617-496-5308
 fax: +1 617-495-5667
 www.oeb.harvard.edu/faculty/giribet/

 Department of Marine Ecology
 Center for Advanced Studies of Blanes (CEAB-CSIC)
 C/Accés Cala St. Francesc 14
 17300 Blanes, Girona, SPAIN
 phone: +34 972 336 101
 fax: +34 972 337 806
 www.ceab.csic.es





-- 
Ross Lazarus MBBS MPH;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444
http://scholar.google.com/citations?hl=en&user=UCUuEM4J



Re: [galaxy-user] Cuffdiff no without replicates

2012-10-03 Thread Ross
On Wed, Oct 3, 2012 at 9:11 PM, i b ibse...@gmail.com wrote:
 Dear all,
 how reliable is running Cuffdiff without replicates? e.g. one sample
 against another one?

 Does using replicates make any difference statistically?

Seqanswers might be a better place to ask this very interesting
technical question that goes way beyond Galaxy...

My 2c: Statistically speaking, sequencing and biology are both noisy.
Replicates provide information about non-experimental (technical and
biological) variation. That variation is usually not the variation you
are looking for, but if you want to remove it, you have to model it
and that requires information from replicates (or really good
guesswork). In some situations (eg extreme experimental conditions),
I'm sure you'll find biologically meaningful signal without them but
in my experience, they can really help to decrease non-experimental
noise, particularly where the experimental condition induces only
subtle changes in transcript abundance.

You could always analyse a data set with replicates and compare the
results with and without those replicates yourself to see what happens
- it would be a nice paper I'm sure.
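
A minimal sketch of the point about replicates, in Python. This is NOT how
Cuffdiff models variation; it uses a plain Welch t statistic on made-up
abundance values purely to show that a within-condition variance estimate
needs at least two replicates per condition. The function name and the
numbers are illustrative assumptions.

```python
import statistics


def welch_t(group_a, group_b):
    """Welch's t statistic for two groups of replicate measurements.

    Raises ValueError when either group has fewer than two replicates,
    because the within-group variance cannot be estimated from one value.
    """
    if len(group_a) < 2 or len(group_b) < 2:
        raise ValueError("need at least two replicates per condition")
    var_a = statistics.variance(group_a)  # sample variance (n - 1)
    var_b = statistics.variance(group_b)
    standard_error = (var_a / len(group_a) + var_b / len(group_b)) ** 0.5
    if standard_error == 0:
        raise ValueError("replicates are identical; no variance to model")
    return (statistics.mean(group_a) - statistics.mean(group_b)) / standard_error


# Made-up abundance values for one transcript, three replicates each.
control = [10.0, 12.0, 11.0]
treated = [20.0, 22.0, 21.0]
print(welch_t(control, treated))
```

With a single sample per condition the function refuses to run, which is
the situation described above: the noise has to be guessed rather than
measured.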


 Thanks,
 ib


Re: [galaxy-user] Bam to Fastq conversion

2012-05-25 Thread Ross
There's a picard sam to fastq converter which AFAIK works on bams too:

https://main.g2.bx.psu.edu/tool_runner?tool_id=picard_SamToFastq
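
For anyone doing the same thing outside Galaxy, here is a hedged sketch of
how the equivalent Picard command line could be assembled from Python. It
assumes a Picard release that ships a separate SamToFastq.jar and accepts
the INPUT=/FASTQ=/SECOND_END_FASTQ= style of arguments; the jar location
and file names are placeholders, not paths from this thread.

```python
def sam_to_fastq_command(jar_path, bam_path, fastq_path, second_end_fastq=None):
    """Build (but do not run) a Picard SamToFastq command line as a list."""
    command = [
        "java", "-jar", jar_path,
        "INPUT=" + bam_path,
        "FASTQ=" + fastq_path,
    ]
    # Paired-end data needs a second output file for the mate reads.
    if second_end_fastq is not None:
        command.append("SECOND_END_FASTQ=" + second_end_fastq)
    return command


# Placeholder paths for illustration only.
print(" ".join(sam_to_fastq_command("SamToFastq.jar", "reads.bam", "reads_1.fastq",
                                    second_end_fastq="reads_2.fastq")))
```

The list can be handed to subprocess.check_call once the paths point at
real files.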


On Sat, May 26, 2012 at 9:48 AM, William M. Strauss
cyclotouri...@mac.com wrote:
 I have a bunch of BAM files that I need to convert to FASTQ, are there such 
 tools in Galaxy?



-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] Alignment coordinates

2012-04-13 Thread Ross
Erika,

Although Galaxy doesn't have a suitable mapping tool for fasta format
sequences, you can use the Galaxy clustalw tool option to output a
fasta format file and then upload that fasta file to the ncbi Blast
tool at http://blast.ncbi.nlm.nih.gov/Blast.cgi to find where the
sequences align - blast allows mapping of sequences on multiple
organisms.

I hope this helps in your research

On Fri, Apr 13, 2012 at 10:36 AM, Jennifer Jackson j...@bx.psu.edu wrote:
 Hi Erika,

 Unfortunately, Galaxy does not have a parser tool to transform clustal
 datatypes into tab-coordinate based files.

 Sorry we couldn't help!

 Best,

 Jen
 Galaxy team

 On 4/12/12 8:07 AM, Erika Kruse wrote:

 Hello,

 I imported a sequence alignment file from clustalw. How can I generate a
 list of the coordinates of the matching regions?

 Thank you.

 Erika






-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] Quick question on generated tables

2012-03-13 Thread Ross
Hi, Carly.

Hover your mouse pointer over the little floppy disk icon and the help
text 'download' should appear - click the icon to download the
contents of the file to your local workstation. It's an interval file
so it will be tab delimited and should open easily in your favourite
spreadsheet program.
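
If a spreadsheet is not convenient, the downloaded file can also be read
with a few lines of Python. This is only a sketch: it assumes the first
three tab-separated columns are chromosome, start and end (Galaxy interval
files can place them elsewhere, so check the column assignments shown in
the history item), and the example rows are invented.

```python
import csv
import io


def read_interval_rows(handle):
    """Parse tab-delimited interval lines into (chrom, start, end, rest)."""
    rows = []
    for fields in csv.reader(handle, delimiter="\t"):
        # Skip blank lines and comment/header lines starting with '#'.
        if not fields or fields[0].startswith("#"):
            continue
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        rows.append((chrom, start, end, fields[3:]))
    return rows


# Invented example content standing in for the downloaded file.
example = io.StringIO(
    "#chrom\tstart\tend\tname\n"
    "chr1\t100\t200\tpromoterA\n"
    "chr2\t5000\t5600\tpromoterB\n"
)
for row in read_interval_rows(example):
    print(row)
```

For a real download, replace the StringIO object with
open("your_file.interval", newline="").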

Glad to hear you're enjoying using Galaxy - I hope this helps you get
your work done

On Wed, Mar 14, 2012 at 9:54 AM, Carly Hom carlyho...@gmail.com wrote:
 Hello,

 I am working on a project that involves extracting a list of promoter
 regions that contain a significant enough H3k27me3 signal. So far I have
 produced an output in the ENCODE genome browser which is great for the
 visual representations I will need, but I also need to extract the
 table that was generated. I see the first few lines in a snapshot of the
 data that galaxy provided me with, but how do I extract that entire table
 into spreadsheet or txt format? If you could enlighten me on a galaxy


Re: [galaxy-user] Problems starting Galaxy Cloudman

2012-03-09 Thread Ross
-linux-x86_64-ucs4.egg,
  /mnt/galaxyTools/galaxy-central/eggs/Mako-0.4.1-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/WebHelpers-0.2-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/simplejson-2.1.1-py2.6-linux-x86_64-ucs4.egg,
  /mnt/galaxyTools/galaxy-central/eggs/wchartype-0.1-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/elementtree-1.2.6_20050316-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/docutils-0.7-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/WebOb-0.8.5-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/Routes-1.12.3-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/Cheetah-2.2.2-py2.6-linux-x86_64-ucs4.egg,
  /mnt/galaxyTools/galaxy-central/eggs/PasteDeploy-1.3.3-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/PasteScript-1.7.3-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/eggs/Paste-1.6-py2.6.egg,
  /mnt/galaxyTools/galaxy-central/lib, /usr/lib/python2.6/,
  /usr/lib/python2.6/plat-linux2, /usr/lib/python2.6/lib-tk,
  /usr/lib/python2.6/lib-old, /usr/lib/python2.6/lib-dynload
  Traceback (most recent call last):
    File /mnt/galaxyTools/galaxy-central/lib/galaxy/web/buildapp.py,
  line 82, in app_factory
      app = UniverseApplication( global_conf = global_conf, **kwargs )
    File /mnt/galaxyTools/galaxy-central/lib/galaxy/app.py, line 24, in
  __init__
      self.config.check()
    File /mnt/galaxyTools/galaxy-central/lib/galaxy/config.py, line 243,
  in check
      tree = parse_xml( config_filename )
    File /mnt/galaxyTools/galaxy-central/lib/galaxy/util/__init__.py,
  line 105, in parse_xml
      tree = ElementTree.parse(fname)
    File
  /mnt/galaxyTools/galaxy-central/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py,
  line 859, in parse
      tree.parse(source, parser)
    File
  /mnt/galaxyTools/galaxy-central/eggs/elementtree-1.2.6_20050316-py2.6.egg/elementtree/ElementTree.py,
  line 576, in parse
      source = open(source, rb)
  IOError: [Errno 2] No such file or directory:
  './migrated_tools_conf.xml'
  Removing PID file paster.pid
 
 
  While I'm here, I see a new Galaxy Cloudman AMI
   072133624695/galaxy-cloudman-2012-02-26.  I can't manage to start that, I
  get an error as below, with all types of instance,
  (tiny/small/medium/large). Is that a recommended AMI now?  It would be
  good to have a new updated AMI.
 
  snap1.png
 
  Thanks !
 
  --
  Greg Edwards,
  Port Jackson Bioinformatics
  gedwar...@gmail.com
 




 --
 Greg Edwards,
 Port Jackson Bioinformatics
 gedwar...@gmail.com





-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] RE : % On-Off target

2011-12-20 Thread Ross
Hi Antoine.
Thanks - sorry, I haven't analysed that particular platform - anyone?

http://picard.sourceforge.net/picard-metric-definitions.shtml#HsMetrics
seems comprehensive...
Why not try the Galaxy picard Hybrid Selection metrics tool on your data?
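
To make the "+ or - 50 bp" idea concrete, here is a rough Python sketch of
an on-target fraction with padded targets. It is an assumption-laden
illustration, not Picard's definition: the Picard hybrid selection metrics
are computed per base against bait and target intervals, whereas this
counts whole reads that overlap a padded target. All coordinates are
invented, half-open (start, end) pairs.

```python
def pad_targets(targets, padding=50):
    """Extend each (chrom, start, end) target by `padding` bp on both sides."""
    return [(chrom, max(0, start - padding), end + padding)
            for chrom, start, end in targets]


def on_target_fraction(reads, targets, padding=50):
    """Fraction of reads overlapping any padded target (simple O(n*m) scan)."""
    if not reads:
        return 0.0
    padded = pad_targets(targets, padding)
    on_target = 0
    for chrom, start, end in reads:
        if any(chrom == t_chrom and start < t_end and end > t_start
               for t_chrom, t_start, t_end in padded):
            on_target += 1
    return on_target / float(len(reads))


# Invented example: one exon and four aligned reads.
exons = [("chr1", 1000, 1200)]
reads = [("chr1", 960, 990), ("chr1", 1100, 1150),
         ("chr1", 2000, 2050), ("chr2", 1000, 1050)]
print(on_target_fraction(reads, exons, padding=50))  # 0.5
print(on_target_fraction(reads, exons, padding=0))   # 0.25
```

For real data the linear scan would be too slow; an interval index or
BEDtools would be the practical choice.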

On Wed, Dec 21, 2011 at 6:25 PM, Antoine ROUSSELIN
a.rousse...@baclesse.fr wrote:

 Hello,

 My English is very bad, it's a shame!

 I have a BED file for my baits (the Agilent SureSelect targets, with the
 captured exons and introns). From the sequencing output I get .bam files,
 which are my alignments against the complete human genome (made with CASAVA).
 I am trying to get the coverage of the exonic regions only (+ or - 50 bp).

 Thanks

 ROUSSELIN Antoine
 Clinical Biology and Oncology Laboratory
 Centre François BACLESSE
 France/Caen
 a.rousse...@baclesse.fr
 Tel : (33) 02.31.45.40.44
 Fax : (33) 02.31.45.50.53



  Original message
 From: Ross [mailto:ross.laza...@gmail.com]
 Date: Wed. 12/21/2011 03:16
 To: Antoine ROUSSELIN
 Cc: galaxy-user@lists.bx.psu.edu
 Subject: Re: [galaxy-user] % On-Off target

 Hello Antoine,
 I'm not sure I really understand your question but if the metrics described
 at http://picard.sourceforge.net/picard-metric-definitions.shtml#HsMetrics are
 of use, you could try the picard hybrid selection metrics Galaxy tool or
 use it on the command line. Otherwise perhaps you can get a more helpful
 response if you provide a clear explanation of the data formats you have
 and the measures you want.


 On Wed, Dec 21, 2011 at 2:59 AM, Antoine ROUSSELIN
 a.rousse...@baclesse.fr wrote:

 
  hello,
  I'm looking for a tool (or command line) to determine the % on/off target,
  within + or - 50 bp of each exon, from my capture file, which is not
  annotated (!!!). The capture is an in-house Agilent SureSelect design.
 
  Current pipeline:
  GAIIx Illumina
  CASAVA1.8
  IGV
  CNV-seq
  SAMtools
  BEDtools
  GALAXY
  NextGENe
 
  Please HELP
 
  ROUSSELIN Antoine
  Clinical Biology and Oncology Laboratory
  Centre François BACLESSE
  France/Caen
  a.rousse...@baclesse.fr
  Tel : (33) 02.31.45.40.44
  Fax : (33) 02.31.45.50.53
 
 
 



 --
 Ross Lazarus MBBS MPH;
 Associate Professor, Harvard Medical School;
 Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;




--
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] history was empty

2011-12-05 Thread Ross
Hi, Weimin,

What do you see when you select Options | Saved Histories (top right of the
history panel)?

If that doesn't show your old histories to re-open, please let us know
exactly which galaxy URL you were using and what login name you used so we
can take a look.

Note that you need to be logged in to the same user account each time for
histories to be reliably preserved between sessions.


On Tue, Dec 6, 2011 at 12:55 PM, dongdong zhaoweiming 
zhaoweiming1...@yahoo.com.cn wrote:

 Hi,

 I opened my Galaxy and found the history was empty. I could not find my
 previous history, so how can I find it? Thanks a lot!

 Regards
 weimin zhao





-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;

Re: [galaxy-user] more on varying number of output files in xml

2011-11-09 Thread Ross
Hi, Nicholas,

You'll almost certainly want to write a wrapper to create the plink
command line and run it - a wrapper script can construct a correct
plink command line and then do all sorts of post-plink transformation
on the outputs as needed - which in my experience it usually is.

Most of the rgenetics tools do just that so looking at the source
under tools/rgenetics may provide some prototypes you can change to
suit your needs - eg rgQC.py and rgQC.xml
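
A stripped-down sketch of what such a wrapper might look like follows. It
is not the rgenetics code: the function names, the "plink_out" prefix and
the working-directory handling are assumptions for illustration. The idea
is to give plink one --out prefix inside a scratch directory, then pick up
whatever files it wrote with that prefix, so the number of outputs does
not need to be known in advance.

```python
import glob
import os
import subprocess


def build_plink_command(input_prefix, out_prefix, extra_args):
    """Assemble the plink command line as a list of arguments."""
    return (["plink", "--noweb", "--file", input_prefix, "--out", out_prefix]
            + list(extra_args))


def collect_outputs(out_prefix):
    """Return every file plink wrote using the given --out prefix."""
    return sorted(glob.glob(out_prefix + ".*"))


def run_plink(input_prefix, extra_args, work_dir):
    """Run plink in work_dir and return the list of files it produced."""
    out_prefix = os.path.join(work_dir, "plink_out")
    subprocess.check_call(build_plink_command(input_prefix, out_prefix, extra_args))
    return collect_outputs(out_prefix)
```

A call such as run_plink("mydata", ["--assoc"], "/tmp/job1") would then
return the .log file plus whatever result files that analysis produced,
and the wrapper can copy, rename or summarise them for Galaxy.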

Plink spews out all sorts of stuff so you may want to explore the Html
datatype - see 
http://lists.bx.psu.edu/pipermail/galaxy-dev/2010-September/003311.html
for a brief explanation.

On Wed, Nov 9, 2011 at 1:31 AM, Nicholas Robinson
nicholas.robin...@nofima.no wrote:
 Hi Galaxy users,

 I am trying to write a simple tool that sends commands to shell to run Plink
 (and other analysis packages set up on our Galaxy server). I am new to this,
 but have managed to write some tools before that work in a similar fashion,
 at least when you can specify what input and output files will be produced.
 For plink there are a large number of options and different outputs
 possible. I have seen the discussion on the user group about how to handle
 multiple output files (eg.
 http://lists.bx.psu.edu/pipermail/galaxy-user/2009-September/000743.html).
 Normally to run plink you specify a single file name (eg. --out $output1)
 and the program can produce two, to a few, output files (eg depending on the
 analysis it might produce a $output1.log file and a $output1.cmh file if I
 do a certain test, otherwise it might produce the log file and two other
 files). If I add:

 --out $output1 $output1.id $__new_file_path__

 to the command line to try to capture all the output, as suggested for when
 the interpreter is python, then when I run the tool in galaxy it says:

 ERROR: Problem parsing the command line arguments.

 ie. plink makes a fuss about the addition to the command line (I suspect).
 Have any of you figured out a way to handle varying numbers of multiple
 output files under these circumstances? Please give me a simple response if
 you can, I am a new user.

 Cheers,

 Nick




-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] Application error

2011-11-09 Thread Ross
Hi, Sarah.

The message you saw:

AssertionError: ##rgFastQC.py error - cannot find executable 
/Users/allabyrg/Desktop/galaxy-dist/tool-data/shared/jars/FastQC/fastqc

suggests that the wrapper can't find the required FastQC perl wrapper
called fastqc. For complicated reasons, it needs to be available as
part of a complete FastQC installation as it calls the various java
components and expects to find them where the FastQC distribution puts
them.

Please try unpacking the FastQC distribution archive into that
.../tool-data/shared/jars directory - no galaxy restart required - and
see if that helps?
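
To check the result before re-running the tool, something like the
following Python sketch can be used. It only tests that a file exists at
the given path and is marked executable; it is not the check the wrapper
itself performs, and the path in the usage line is simply the one from
the error message above.

```python
import os


def check_executable(path):
    """Return 'missing', 'not executable' or 'ok' for the given path."""
    if not os.path.isfile(path):
        return "missing"
    if not os.access(path, os.X_OK):
        return "not executable"
    return "ok"


fastqc_path = ("/Users/allabyrg/Desktop/galaxy-dist/tool-data/"
               "shared/jars/FastQC/fastqc")
print(fastqc_path, "->", check_executable(fastqc_path))
```

If this reports "not executable", running chmod +x on the fastqc script
is the usual remedy.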


On Wed, Nov 9, 2011 at 9:35 AM, Palmer, Sarah s.a.pal...@warwick.ac.uk wrote:
 Hi,
 I have been trying to run fastqc on our standalone installations of Galaxy
 and repeatedly get the attached error message. I have tried this now with
 both 454 and Illumina fastq files, on 2 different Mac Machines with 10.5 and
 10.6 OS and Python versions 2.5 and 2.7 on one of the machines and get the
 same problem occurring each time.
 Could you please help me find a fix for this?
 Cheers,
 Sarah




-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] GATK best practices with local installation of Galaxy

2011-08-03 Thread Ross
Hi, Camille,

I can see this really needs a 'proper' fix - preferably taking
advantage of the automated header merge.
Preserving the metadata from each bam automatically is safer and less
error-prone but you could use the existing Replace sam/bam header
tool to do the surgery once you have a correct header in SAM format in
your history?

I'm currently testing changes which replace the current samtools merge
code with a call to Picard MergeSamFiles.
I'll add a switch to control whether all input headers are merged in
case there are situations where it's not wanted.

I'll let you know when you can try it out on our test instance and
which revision of the galaxy-central repository contains the changes
so you can get it working on your local installation.
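
In the meantime, a hedged Python sketch of the two pieces discussed in
this thread: generating the @RG header text per merge rather than editing
rg.txt by hand, and assembling a Picard MergeSamFiles call. It assumes a
Picard release with a separate MergeSamFiles.jar and INPUT=/OUTPUT= style
arguments; the function names, sample ids and file names are
placeholders, and this is not the code in the revised Galaxy tool.

```python
def read_group_header(sample_ids, platform="Illumina"):
    """Build tab-separated @RG header lines, one per sample id."""
    lines = []
    for sample_id in sample_ids:
        fields = ["@RG", "ID:" + sample_id, "SM:" + sample_id,
                  "LB:" + sample_id, "PL:" + platform]
        lines.append("\t".join(fields))
    return "\n".join(lines) + "\n"


def merge_sam_files_command(jar_path, input_bams, output_bam):
    """Build (but do not run) a Picard MergeSamFiles command line."""
    command = ["java", "-jar", jar_path]
    # MergeSamFiles takes one INPUT= argument per file to merge.
    command.extend("INPUT=" + bam for bam in input_bams)
    command.append("OUTPUT=" + output_bam)
    return command


print(read_group_header(["s1", "s2"]), end="")
print(" ".join(merge_sam_files_command("MergeSamFiles.jar",
                                       ["s1.bam", "s2.bam"], "merged.bam")))
```

Writing the header text to a per-job temporary file avoids every merge
ending up with the same IDs and labels.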

On Wed, Aug 3, 2011 at 11:49 PM, Camille Stephan
camille.step...@irbbarcelona.org wrote:
 Hi Ross,
 thanks for your answer. I found a dirty fix for merging pairs of bam files,
 had to change a couple of things in my local installation though.

 - Add read groups to each BAM file separately using Picard's Add or Replace
 Groups (with ID=s1 and ID=s2 for each file)
 - Create the rg.txt file containing something like this:

 @RG ID:s1    SM:s1    LB:s1    PL:Illumina
 @RG ID:s2    SM:s2    LB:s2    PL:Illumina

 Modify sam_merge.py to call:

 samtools merge -rh path/to/rg.txt %s %s...

 It works. The problem is all (pairs of) files will end up with the same IDs
 and labels, unless the rg.txt file is changed every time.
 Would it be very difficult to add to the Galaxy wrapper the option of
 creating rg.txt on the fly and adding the -h option to the samtools call?

 I'm not familiar with creating wrappers for Galaxy, any suggestion as to
 where to start?

 Thanks again,
 Camille



 On Wed, Aug 3, 2011 at 2:34 PM, Ross ross.laza...@gmail.com wrote:

 Camille, thanks for reporting this - I think you have found a bug.
 We definitely need to be able to preserve metadata when we merge bams.
 Thanks for your suggestion of using mergeSamFiles - yes, I think it
 might be a good fix for this problem - but it will take a little while
 and won't reach the Main site for a few weeks once it's done. It is
 possible to write your own wrapper locally if you need it fast.
 Sorry for the inconvenience and thanks again.

 On Wed, Aug 3, 2011 at 6:15 PM, Camille Stephan
 camille.step...@irbbarcelona.org wrote:
  Hello guys,
  I'm trying to run a pipeline of the best practices for snp and indel
  discovery as described by the people at Broad and I'm running into
  troubles with the GATK tools in a local installation of Galaxy.
  The main problem I have is that merging bam files with the samtools
  merge tool doesn't keep read group for each sample, causing Count
  Covariates to crash. The pipeline works fine with a single bam file,
  but I need to realign at least two files at a time.
  Is there a way to set the read group of a merged bam inside Galaxy? Are
  there plans to include the merge tool from Picard in Galaxy? Is there
  an easy way for me to do this locally? (Although I would like to run
  this in the cloud later on when the workflow is ready).
 
  Thanks!
  Camille
 
  --
  ***
  Camille Stephan-Otto Attolini, PhD
  Senior Research Officer, Bioinformatics and Biostatistics unit
  IRB Barcelona
  Tel (+34) 93 402 0553
 
 
 



 --
 Ross Lazarus MBBS MPH;
 Associate Professor, Harvard Medical School;
 Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
 Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



 --
 ***
 Camille Stephan-Otto Attolini, PhD
 Senior Research Officer, Bioinformatics and Biostatistics unit
 IRB Barcelona
 Tel (+34) 93 402 0553





-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;


Re: [galaxy-user] GATK best practices with local installation of Galaxy

2011-08-03 Thread Ross
Hi, Camille.

If you can find some time to upload some of your bam files, could you
please test the revised bam merge tool on http://test.g2.bx.psu.edu/
and let me know how you go. This won't be on the main site until the
next scheduled update in a few weeks.

If you need this locally, the changes are in galaxy-central from where
anyone can grab them - the key file you need to update is
tools/samtools/sam_merge.xml and you'll also need MergeSamFiles.jar
from a recent Picard release to be available in your
tool-data/shared/jars directory.

Hope this helps - thanks for pointing out the bug.


Re: [galaxy-user] Weblogo results empty

2011-07-07 Thread Ross
Holger, thanks for finding those errors - I'll take a look shortly.

What's the 'dependency directory' ? I don't think the wrapper knows
anything about it.
The tool assumes that the weblogo script is on the path when the
actual job starts executing - wherever that is.

If the weblogo script produces output after you source that script, then
it must be on the path and working, I think.
So, the environment on the execution node must include the relevant
path. Otherwise it won't work.

What path does the execution host get when a galaxy job is run?
Does it include the right path to that weblogo script (marked executable)?
Can the user each job runs as execute it?
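
One way to answer the questions above is to run a small check as the job user on the execution host. The following is only a sketch, not part of the wrapper: the function name describe_tool_on_path is made up for illustration, and it relies on the standard library's shutil.which, which only reports files that are both present and executable.

```python
import os
import shutil


def describe_tool_on_path(tool, path=None):
    """Report where `tool` resolves on a PATH string, or that it does not."""
    # Fall back to the PATH of the current process (the job environment).
    search_path = os.environ.get("PATH", "") if path is None else path
    found = shutil.which(tool, path=search_path)
    if found is None:
        return "%s: not found or not executable on PATH=%s" % (tool, search_path)
    return "%s: resolves to %s" % (tool, found)


if __name__ == "__main__":
    # Run this the same way a Galaxy job would be run, as the same user.
    print(describe_tool_on_path("weblogo"))
```

If the output differs between an interactive login and a job submitted through the cluster, the two environments are not getting the same PATH.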


On Thu, Jul 7, 2011 at 7:56 PM, Holger Klein h.kl...@imb-mainz.de wrote:
 Hi Ross,

 On 07/07/2011 02:07 AM, Ross wrote:

 Please try the new version 0.4 of the weblogo wrapper in
 galaxy-central #5772 - it has additional error reporting that may help
 clarify dependency or other problems and let me know how you go?

 thanks, with the new version I get some more hints.
 It seems that there is a problem with the path.

 Just having weblogo installed in the dependency directory and using the
 env.sh mechanism to set the path, the wrapper doesn't find the
 executable at all:
 ---%---
 ## rgWebLogo3.py error - cannot locate the weblogo binary weblogo on the
 current path
 ## Please ensure it is installed and working from
 http://code.google.com/p/weblogo
 ---%---

 When I put a soft link to a directory which is in the galaxy user's
 static path, I get a different error:
 ---%---
 Traceback (most recent call last):
   File "/local/data/home/galaxy/galaxy-dist/tools/rgenetics/rgWebLogo3.py", line 156, in <module>
     checks,s = w.run()
   File "/local/data/home/galaxy/galaxy-dist/tools/rgenetics/rgWebLogo3.py", line 127, in run
     s = self.runCL()
   File "/local/data/home/galaxy/galaxy-dist/tools/rgenetics/rgWebLogo3.py", line 47, in runCL
     print >> sys.stderr, '## rgWebLogo3.py error - executing %s returned error code %d' % cl
 TypeError: not enough arguments for format string
 ---%---

 Line 47 seems to lack the variable for the return code, when changing
 the line to

             print >> sys.stderr, '## rgWebLogo3.py error - executing %s returned error code %d' % (cl, rval)

 I get the following message:
 ---%---
 ## rgWebLogo3.py error - executing weblogo -F png -c auto -o
 /local/data/galaxy_files/000/dataset_304.dat -U bits -t
 Galaxy-Rgenetics Sequence Logo -f
 /local/data/galaxy_files/000/dataset_286.dat -s large returned error code 1
 ## This may be a data problem or a tool dependency (weblogo)
 installation problem
 ## Please ensure weblogo is correctly installed and working on the
 command line -see http://code.google.com/p/weblogo
 ---%---
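
 The TypeError in the earlier traceback comes from a format string with two placeholders that is given only one value. A minimal sketch of the bug and the fix, with illustrative function names that are not taken from rgWebLogo3.py:

```python
def format_failure(cl, rval):
    """Build the error line with both values supplied as a tuple (the fix)."""
    return "## rgWebLogo3.py error - executing %s returned error code %d" % (cl, rval)


def broken_format(cl):
    """Reproduce the original bug: two placeholders, only one argument."""
    try:
        return "executing %s returned error code %d" % cl
    except TypeError as err:
        return "TypeError: %s" % err
```

 Calling broken_format with a plain command string raises the same "not enough arguments for format string" error, while format_failure fills both placeholders.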

 So it still seems to boil down to my local weblogo installation.
 Sourcing the respective env.sh and executing the above command line, I
 get a valid png though (again with the warning mentioned before):
 ---%---
 galaxy@imbc1:~/tmp/1$ source ~/dependencies/weblogo/default/env.sh
 galaxy@imbc1:~/tmp/1$ weblogo -F png -c auto -o
 /local/data/galaxy_files/000/dataset_304.dat -U bits -t
 Galaxy-Rgenetics Sequence Logo -f
 /local/data/galaxy_files/000/dataset_286.dat -s large
 /home/galaxy/python/lib/python2.6/site-packages/CoreBio-0.5.0-py2.6.egg/corebio/seq_io/_nexus/__init__.py:19:
 DeprecationWarning: the sets module is deprecated
  import sets
 ---%---

 Could this be related to the way galaxy is setting the paths dynamically
 using the env.sh file? Do I have to adjust python paths in there as well?

 Regards,
 Holger





 On Tue, Jul 5, 2011 at 5:49 PM, Holger Klein h.kl...@imb-mainz.de wrote:
 Hi Ross,

 thanks for taking care of this issue.

 On 07/05/2011 12:31 AM, Ross wrote:

 Is this error seen on Galaxy main or test? If so please share the
 history with me so I can see the input and reproduce what sounds like
 a wrapper error?

 Otherwise, if this is on a private instance, and if the tool has never
 produced output successfully, then this may be a dependency
 installation problem - eg you may need to ensure that the weblogo3
 executable is available and working correctly on the path used by your
 execution nodes. To assure yourself that your data works with the
 tool, please try running it on main using the same data, and let me
 know what you see?

 in fact it's a private instance of galaxy, it's the latest version of
 galaxy-dist (hg summary: 5743:720455407d1c).
 The input data is fine, it's a clustalw alignment in fasta format which
 can be used by the weblogo module on galaxy main.

 Maybe some background info on the weblogo installation helps:
 it's located below the tool_dependency_dir as defined in
 universe_wsgi.ini in weblogo/3.0 (with default as a link to 3.0). It
 contains the file env.sh which sets the PATH:
 export PATH=/home/galaxy/dependencies/weblogo/3.0:$PATH

 Starting the weblogo executable with the galaxy virtualenv python seems
 to work (just tested --help), although it returns a warning

Re: [galaxy-user] Weblogo results empty

2011-07-07 Thread Ross
AFAIK, the requirements stuff is still work in progress?

Yes, of course you're right - there has to be a better way -
particularly where there are complex inter-dependencies like
weblogo/python/corebio
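
Until there is a better way, a quick test of whether a given interpreter can actually import a module such as corebio may save some guessing. This is only a sketch; the helper name can_import is hypothetical, and the interpreter paths you pass in are whatever exists on your own system.

```python
import subprocess
import sys


def can_import(interpreter, module):
    """Return True if `interpreter -c "import module"` exits with status 0."""
    result = subprocess.run(
        [interpreter, "-c", "import %s" % module],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


if __name__ == "__main__":
    # Compare the interpreter running this script with any others you care
    # about, for example the one a script's #! line would select.
    print(sys.executable, can_import(sys.executable, "corebio"))
```

Running it once for the system python and once for the virtualenv python shows which of them can see the module.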

On Thu, Jul 7, 2011 at 10:41 PM, Holger Klein h.kl...@imb-mainz.de wrote:
 Hi Ross,

 On 07/07/2011 12:38 PM, Ross wrote:
 Holger, thanks for finding those errors - I'll take a look shortly.

 What's the 'dependency directory' ? I don't think the wrapper knows
 anything about it.

 it's defined in universe_wsgi.ini:

 # The directory containing tool dependencies
 tool_dependency_dir = /local/data/home/galaxy/dependencies

 We only recently installed galaxy locally, and Nate pointed me towards
 this way to handle external tool dependencies.
 This issue here is related:
 https://bitbucket.org/galaxy/galaxy-central/issue/82/fix-the-tag-set-in-the-tool-configs

 Until now when tools didn't work with that mechanism (CCAT, clustalw) I
 simply put a link into a directory which is in galaxy's path.



 The tool assumes that the weblogo script is on the path when the
 actual job starts executing - wherever that is.

 If the weblogo script produces output after you source that script, then
 it must be on the path and working, I think.
 So the environment on the execution node must include the relevant
 path. Otherwise it won't work.

 What path does the execution host get when a galaxy job is run?
 Does it include the right path to that weblogo script (marked executable)?
 Can the user each job runs as execute it?

 I got a step further. The mechanism mentioned above, putting a link to
 the weblogo executable into the path, didn't work because the weblogo
 script uses the system-wide python (from #!/usr/bin/env python), which
 doesn't have the corebio module installed. After changing the
 interpreter to #!/home/galaxy/python/bin/python (the galaxy-specific
 virtualenv) it works. Somehow I had just assumed that setting the
 PYTHON variable in the startup script would be sufficient.

 So, now it works, but I wonder if there's a mechanism that is cleaner
 than hard-coding the python interpreter? Is there a way to tell galaxy
 or wrapper scripts to use a specific python version?

 Regards,
 Holger








Re: [galaxy-user] Weblogo results empty

2011-07-06 Thread Ross
Thanks for your input and patience, Holger!

Please try the new version 0.4 of the weblogo wrapper in
galaxy-central #5772 - it has additional error reporting that may help
clarify dependency or other problems and let me know how you go?

On Tue, Jul 5, 2011 at 5:49 PM, Holger Klein h.kl...@imb-mainz.de wrote:
 Hi Ross,

 thanks for taking care of this issue.

 On 07/05/2011 12:31 AM, Ross wrote:

 Is this error seen on Galaxy main or test? If so please share the
 history with me so I can see the input and reproduce what sounds like
 a wrapper error?

 Otherwise, if this is on a private instance, and if the tool has never
 produced output successfully, then this may be a dependency
 installation problem - eg you may need to ensure that the weblogo3
 executable is available and working correctly on the path used by your
 execution nodes. To assure yourself that your data works with the
 tool, please try running it on main using the same data, and let me
 know what you see?

 in fact it's a private instance of galaxy, it's the latest version of
 galaxy-dist (hg summary: 5743:720455407d1c).
 The input data is fine, it's a clustalw alignment in fasta format which
 can be used by the weblogo module on galaxy main.

 Maybe some background info on the weblogo installation helps:
 it's located below the tool_dependency_dir as defined in
 universe_wsgi.ini in weblogo/3.0 (with default as a link to 3.0). It
 contains the file env.sh which sets the PATH:
 export PATH=/home/galaxy/dependencies/weblogo/3.0:$PATH

 Starting the weblogo executable with the galaxy virtualenv python seems
 to work (just tested --help), although it returns a warning:

 ~/python/bin/python ./weblogo --help
 /home/galaxy/python/lib/python2.6/site-packages/CoreBio-0.5.0-py2.6.egg/corebio/seq_io/_nexus/__init__.py:19:
 DeprecationWarning: the sets module is deprecated
  import sets

 I also tested putting a link to the weblogo executable in the PATH
 that's defined for the galaxy user (as opposed to the dependency dir
 mechanism, that I have to admit I don't fully understand yet), but that
 also doesn't work. Could this be an issue of PYTHONPATH needing to be
 adjusted?

 Regards,
 Holger






 Thanks again.

 On Tue, Jul 5, 2011 at 12:34 AM, Holger Klein h.kl...@imb-mainz.de wrote:
 Dear all,

 I have a problem with the weblogo tool.
 I have a clustalw alignment in fasta format that I'd like to visualize
 as a logo. The sequence logo module ends with a success (green box), the
 info tells me the amount and length of the input data. But the output is
 empty, there are no plots (no matter if I select jpg, png, pdf or text).
 The respective image can't be displayed because it contains errors or
 is empty in case of text.

 I suspect that the actual call of the weblogo tool doesn't succeed, but
 I didn't figure out yet on how to check this. Does anybody have hints on
 where to look?

 Cheers,
 Holger




Re: [galaxy-user] Fastq uploads

2011-06-11 Thread Ross
Keith,

TIP: Due to browser limitations, uploading files larger than 2GB is
guaranteed to fail. To upload large files, use the URL method (below)
or FTP (if enabled by the site administrator).

is written on the screen right below the files box for this very reason.



On Sat, Jun 11, 2011 at 6:50 PM, Keith Giles keithegi...@gmail.com wrote:
 I converted 2 fastq datasets, roughly 3 GB in size each. They have
 been queued up for about 24 hours now. Are they too big? What is the
 best way to upload large fastq datasets?

 Sent from my iPhone




-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] FastQ Groomer and Compute Quality Statistics

2011-06-09 Thread Ross
On Thu, Jun 9, 2011 at 10:12 AM, John David Osborne ozb...@uab.edu wrote:
 Thanks Ross, I don't see it under my local install - are there any 
 pre-written scripts to integrate it with a local galaxy instance?

 I assume you are talking about this tool here:
 http://www.bioinformatics.bbsrc.ac.uk/projects/fastqc/

Hi, John.

It's on main and test, i.e. the FastQC wrapper is distributed with the
current stable and central branches. Your local tool_conf.xml may be
out of date, since it's not automagically refreshed from the distro
.sample. If you do a diff of your local tool_conf.xml against the
current distributed sample, you should see the lines you need to add,
which point to rgenetics/rgFastQC.xml:

Thu,Jun 09 at 10:22am grep -i fastqc tool_conf.xml
   <label text="FastQC: fastq/sam/bam" id="fastqcsambam" />
   <tool file="rgenetics/rgFastQC.xml" />
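
If a plain diff is too noisy, the comparison can be narrowed to the tool entries alone. The following is only a sketch, assuming both files are well-formed XML with tool elements carrying a file attribute, as in the grep output; the function names are made up for illustration.

```python
import xml.etree.ElementTree as ET


def tool_files(xml_text):
    """Collect the file attribute of every <tool> element in a tool_conf document."""
    root = ET.fromstring(xml_text)
    return {t.get("file") for t in root.iter("tool") if t.get("file")}


def missing_tools(local_xml, sample_xml):
    """Return tool files listed in the sample but absent locally, sorted."""
    return sorted(tool_files(sample_xml) - tool_files(local_xml))
```

Reading your local tool_conf.xml and the distributed sample into strings and passing them to missing_tools lists the wrappers the local file does not yet mention.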

As with every other dependency, you'll need to install FastQC locally
where the cluster nodes can find it. The default location is
tool-data/shared/jars/FastQC, which is where the tool looks for the
fastqc perl script (yes, I know...but it's worth it!)

command interpreter=python
rgFastQC.py -i $input_file -d $html_file.files_path -o $html_file
-n $out_prefix -f $input_file.ext -e
${GALAXY_DATA_INDEX_DIR}/shared/jars/FastQC/fastqc
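
To confirm the script is where the -e argument expects it, something like the following can be run on a cluster node. It is only a sketch: the function names are hypothetical, and it assumes GALAXY_DATA_INDEX_DIR is your tool-data directory, as in the default location above.

```python
import os


def fastqc_script(data_index_dir):
    """Build the default path that the wrapper passes with -e."""
    return os.path.join(data_index_dir, "shared", "jars", "FastQC", "fastqc")


def check_fastqc(data_index_dir):
    """Return (path, status); status is 'ok', 'missing' or 'not executable'."""
    path = fastqc_script(data_index_dir)
    if not os.path.isfile(path):
        return path, "missing"
    if not os.access(path, os.X_OK):
        return path, "not executable"
    return path, "ok"
```

Calling check_fastqc("tool-data") from the Galaxy root should report "ok" once the FastQC download has been unpacked there and the script marked executable.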

I hope this helps?


  -John

 
 From: Ross [ross.laza...@gmail.com]
 Sent: Wednesday, June 01, 2011 11:41 AM
 To: John David Osborne
 Cc: galaxy-u...@bx.psu.edu
 Subject: Re: [galaxy-user] FastQ Groomer and Compute Quality Statistics

 You can avoid the space/time overhead of grooming and get
 comprehensive QC reports using the new wrapper for FastQC (under NGS:
 QC) - it takes fastq of any flavour (and bam) groomed or not,
 producing a superset of the compute quality stats output without the
 need for an intermediate step. Highly recommended.

 On Wed, Jun 1, 2011 at 12:02 PM, John David Osborne ozb...@uab.edu wrote:
 I noticed that for our new Illumina data (which is generated in Sanger format) the
 FastQ Groomer output is identical to the Illumina FastQ input file.

 I was hoping to go ahead and just use the raw FastQ files as input (saving
 disk space) for computing quality statistics to look at box plots, but
 the tool Compute Quality Statistics appears to require that
 the data have been run through FastQ Groomer first.

 Is there a way to get around this, and is this a bug? I'm assuming this is some
 sort of safety measure built into this tool?

  -John





 --
 Ross Lazarus MBBS MPH;
 Associate Professor, Harvard Medical School;
 Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
 Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;



Re: [galaxy-user] FastQ Groomer and Compute Quality Statistics

2011-06-01 Thread Ross
You can avoid the space/time overhead of grooming and get
comprehensive QC reports using the new wrapper for FastQC (under NGS:
QC) - it takes fastq of any flavour (and bam) groomed or not,
producing a superset of the compute quality stats output without the
need for an intermediate step. Highly recommended.

On Wed, Jun 1, 2011 at 12:02 PM, John David Osborne ozb...@uab.edu wrote:
 I noticed that for our new Illumina data (which is generated in Sanger format) the
 FastQ Groomer output is identical to the Illumina FastQ input file.

 I was hoping to go ahead and just use the raw FastQ files as input (saving
 disk space) for computing quality statistics to look at box plots, but
 the tool Compute Quality Statistics appears to require that
 the data have been run through FastQ Groomer first.

 Is there a way to get around this, and is this a bug? I'm assuming this is some
 sort of safety measure built into this tool?

  -John





-- 
Ross Lazarus MBBS MPH;
Associate Professor, Harvard Medical School;
Director of Bioinformatics, Channing Lab; Tel: +1 617 505 4850;
Head, Medical Bioinformatics, BakerIDI; Tel: +61 385321444;
