Re: [galaxy-user] cufflinks analysis using .bam files generated by LifeScope (ABI 5500 Sequencer)

2012-12-18 Thread Jennifer Jackson

Hello Davide,

If the data does represent alignments (not just sequence), then there is 
one more item to check for. Another team member reminded me that 
Cufflinks requires an XS custom tag in an input SAM file (or any 
compressed BAM file that represents it). Details are here from the 
Cufflinks manual:


http://cufflinks.cbcb.umd.edu/manual.html#cufflinks_input :

--

Cufflinks takes a text file of SAM alignments, or a binary SAM (BAM) 
file as input. For more details on the SAM format, see the specification 
http://samtools.sourceforge.net/SAM1.pdf. The RNA-Seq read mapper 
TopHat http://tophat.cbcb.umd.edu/ produces output in this format, and 
is recommended for use with Cufflinks. However Cufflinks will accept SAM 
alignments generated by any read mapper. Here's an example of an 
alignment Cufflinks will accept:



s6.25mer.txt-913508 16  chr1 4482736 255 14M431N11M * 0 0 \
   CAAGATGCTAGGCAAGTCTTGGAAG I NM:i:0 XS:A:-


Note the use of the custom tag XS. This attribute, which must have a 
value of + or -, indicates which strand the RNA that produced this 
read came from. While this tag can be applied to any alignment, 
including unspliced ones, it *must* be present for all spliced alignment 
records (those with a 'N' operation in the CIGAR string).


This should be fairly easy to check for now that the data is in 
uncompressed SAM format. Running the pipeline starting with TopHat may 
be the best choice. If you are using the public Main server and have 
continued problems not covered in our tutorial/FAQ, a bug report can be 
submitted from error datasets:

http://wiki.galaxyproject.org/Support#Reporting_tool_errors
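
As a rough illustration of the XS check described above, here is a minimal 
Python sketch. It assumes the SAM file is plain tab-separated text; the 
function name and the read names in the sample data are made up for the 
example and are not part of Cufflinks or Galaxy. It reports spliced 
alignments (an N operation in the CIGAR string) that have no XS:A:+ or 
XS:A:- tag.

```python
import re

# A CIGAR 'N' operation marks a skipped region, i.e. a spliced alignment.
SPLICED = re.compile(r"\d+N")
# The XS tag must carry a strand value of + or -.
XS_TAG = re.compile(r"^XS:A:[+-]$")


def spliced_without_xs(lines):
    """Return read names of spliced alignments that lack a valid XS tag."""
    missing = []
    for line in lines:
        if not line.strip() or line.startswith("@"):
            continue  # skip blank lines and SAM header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 11:
            continue  # not a complete alignment record
        cigar, tags = fields[5], fields[11:]
        if SPLICED.search(cigar) and not any(XS_TAG.match(t) for t in tags):
            missing.append(fields[0])
    return missing


# Illustrative records only (read names are invented for this sketch).
example = [
    "@HD\tVN:1.0\tSO:coordinate",
    "read1\t16\tchr1\t4482736\t255\t14M431N11M\t*\t0\t0\tCAAGATGCTAGGCAAGTCTTGGAAG\t*\tNM:i:0\tXS:A:-",
    "read2\t0\tchr1\t4482736\t255\t14M431N11M\t*\t0\t0\tCAAGATGCTAGGCAAGTCTTGGAAG\t*\tNM:i:0",
    "read3\t0\tchr1\t100\t255\t25M\t*\t0\t0\tCAAGATGCTAGGCAAGTCTTGGAAG\t*\tNM:i:0",
]
print(spliced_without_xs(example))  # prints ['read2']
```

For a real file, the same function can be given an open file handle, for 
example spliced_without_xs(open("alignments.sam")), where the file name is 
a placeholder.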

Hopefully this helps,

Jen
Galaxy team

On 12/17/12 11:46 PM, Davide Degli Esposti wrote:

Dear Jen,

Thank you very much for your help. As you mentioned, the problem was 
most likely with the input. I converted the BAM file to SAM on your 
Galaxy site and then ran Cufflinks, and I got the three output 
files. Do you think this is an acceptable way around the obstacle? I 
am going to run some controls, checking against the results obtained 
from other methods.


Thank you again for your help
Best,

davide
---
Davide Degli Esposti, PhD
Epigenetic (EGE) Group
International Agency for Research on Cancer
Tel. +33 4 72738036
Fax. +33 4 72738322
150, cours Albert Thomas
69372 Lyon Cedex 08
France

*From:* Jennifer Jackson [j...@bx.psu.edu]
*Sent:* Tuesday, December 18, 2012 2:53 AM
*To:* Davide Degli Esposti
*Cc:* galaxy-user@lists.bx.psu.edu
*Subject:* Re: [galaxy-user] cufflinks analysis using .bam files 
generated by LifeScope (ABI 5500 Sequencer)


Hello Davide,

The fact that you are not getting any error points to some problem 
with the input. Perhaps you are sending just sequence data in BAM 
format to Cufflinks, without any alignment performed first? Some sort 
of error would be expected for most other cases. Since this is not the 
Galaxy server our team hosts, it is difficult to state exactly what 
the issue may be; I can only offer suggestions.


TopHat will require fastq files as input, unless this alternate Galaxy 
site has a modified wrapper. Then the alignments generated by TopHat 
(or another alignment tool, sometimes Bowtie is used) in BAM format 
are the input to Cufflinks (along with other optional data).


If your data are aligned BAM, and you continue to have problems with 
this alternate Galaxy site, it would be best to contact the group that 
runs it - the information is on their home page (middle panel) when 
you follow the url.


You could also decide to use the public Galaxy instance run by our 
core project team at http://usegalaxy.org, if we have the tool set you 
wish to use. A generalized tutorial for RNA-seq analysis is available 
here:

http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise

And some troubleshooting help here:
http://wiki.galaxyproject.org/Support#Tools_on_the_Main_server

The tool authors' original documentation would be good to review as well:
http://tophat.cbcb.umd.edu
http://cufflinks.cbcb.umd.edu

Best,

Jen
Galaxy team


On 12/13/12 11:47 AM, Davide Degli Esposti wrote:

Hello,

I am new to Galaxy and I am working on .bam files generated by 
our sequencing platform, using the LifeScope software associated with 
the ABI 5500 sequencer. I uploaded my files to a Galaxy instance 
(http://galaxy.raetschlab.org) and tried to run Cufflinks to assemble 
and quantify read expression levels for each file. However, when I 
run Cufflinks (using default parameters) the output is an empty file.
What is going wrong? Should I use special parameters? Are the .bam 
files generated by LifeScope suitable for Cufflinks analysis, or should 
I convert the xsq ABI output to fastq and then apply TopHat?


I thank you very much for your help

Davide

---
Davide Degli Esposti, PhD
Epigenetic (EGE) Group
International Agency for Research on 

[galaxy-user] How to Provide .gtf file to tophat2

2012-12-18 Thread greg
I'm trying to translate this command line run into the Galaxy GUI but
I'm not seeing a place to specify the .gtf file.

Does anyone know?

tophat2 -p 8 -G /groups/bowtie2_index/mRNA.gtf -o /groups/hp_1
/groups/genome /groups/reads.fastq

thanks,

Greg
___
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using reply all in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

  http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

  http://lists.bx.psu.edu/


Re: [galaxy-user] How to Provide .gtf file to tophat2

2012-12-18 Thread greg
That's very helpful!  Thanks.

-Greg

On Tue, Dec 18, 2012 at 11:39 AM, Jennifer Jackson j...@bx.psu.edu wrote:
 Hello Greg,

 In the TopHat tool form, set TopHat settings to use: to Full parameter
 list. Then find the option Use Own Junctions: and set it to Yes. A new
 option, Use Gene Annotation Model:, will appear underneath it; set that
 to Yes as well, and the GTF file can then be entered there.

 Hopefully this helps,

 Jen
 Galaxy team


 On 12/18/12 8:10 AM, greg wrote:

 I'm trying to translate this command line run into the Galaxy GUI but
 I'm not seeing a place to specify the .gtf file.

 Does anyone know?

 tophat2 -p 8 -G /groups/bowtie2_index/mRNA.gtf -o /groups/hp_1
 /groups/genome /groups/reads.fastq

 thanks,

 Greg

 --
 Jennifer Jackson
 http://galaxyproject.org
