Hello Jianguang,

This general protocol is also in the RNA-seq tutorial:
http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise
--> Understanding and QCing the reads

That said, I ran FastQC on a sample of your data that I had from before, and I see what you mean: the quality drops off steadily after the first 10 bases or so, then falls below phred+20 around the middle of the sequence (for both ends).
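If you want to check the same trend outside of FastQC, here is a minimal Python sketch that averages the quality score at each read position. It is not how FastQC itself is implemented. The function name is made up for illustration, and the offset of 33 assumes Sanger-encoded (phred+33) quality strings; your data may use a different encoding.

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    quality_strings: FASTQ quality lines (one string per read).
    offset: ASCII offset of the encoding (33 is an assumption: Sanger/phred+33).
    """
    if not quality_strings:
        return []
    length = max(len(q) for q in quality_strings)
    totals = [0] * length
    counts = [0] * length
    for q in quality_strings:
        for i, ch in enumerate(q):
            totals[i] += ord(ch) - offset  # decode ASCII character to Phred score
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


if __name__ == "__main__":
    # Two toy reads: 'I' decodes to 40 and '5' decodes to 20 under phred+33.
    print(mean_quality_per_position(["II5", "I55"]))
```

A position where the mean falls under your chosen cutoff is roughly where trimming would start to remove bases.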

There are a couple of options:

1 - Do as Ann suggests: leave the reads alone and see what happens in TopHat. If the mapping fails, you will know that you need to do some quality cleanup.

2 - Use the FastQC results to decide on a lower quality score boundary and trim only the very worst sequences. Because the reads are short (36 bp), yes, take care not to remove too much. From the sample I looked at, even phred+20 would probably clip too aggressively.
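To give a rough idea of what option 2 involves, here is a minimal Python sketch of 3' quality trimming with a minimum-length filter. It is only an illustration, not the method used by any particular Galaxy tool. The function name, the cutoff of 10, the minimum length of 25, and the phred+33 offset are all assumptions that you would need to adjust after testing on your own data.

```python
def trim_3prime(seq, qual, min_q=10, min_len=25, offset=33):
    """Remove trailing bases whose Phred score is below min_q.

    Returns (trimmed_seq, trimmed_qual), or None when the trimmed read
    is shorter than min_len and should be discarded.
    All default values are illustrative assumptions, not recommendations.
    """
    end = len(seq)
    # Walk back from the 3' end while the last base is below the cutoff.
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    if end < min_len:
        return None
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # '#' decodes to 2 and 'I' decodes to 40 under phred+33.
    print(trim_3prime("ACGTACGT", "IIIII###", min_q=10, min_len=4))
    print(trim_3prime("ACGTACGT", "IIIII###", min_q=10, min_len=6))
```

With 36 bp reads, a low cutoff combined with a minimum length keeps most reads long enough to map, which is the reason for choosing a boundary below phred+20 here.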

In general it is best to do as little manipulation as possible with expression data. Some testing on your part will be needed to identify the correct processing, and the same process will not apply to all datasets. But the general path outlined in the tutorial is a good one for what you are trying to do and should be able to address your questions.

Take care,

Jen
Galaxy team




On 8/23/12 7:40 AM, Du, Jianguang wrote:
Dear All,

I am analysing RNA-seq datasets for the differential splicing events
between cell types. My reads are 36bp long. In order to increase the
quality of reads, I need to trim some nucleotides from ends. How many
nucleotides can I trim? I am afraid that if I trim too much, the
reliability of the alignment will be affected.

Thanks in advance.

Jianguang



___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org.  Please keep all replies on the list by
using "reply all" in your mail client.  For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:

   http://lists.bx.psu.edu/listinfo/galaxy-dev

To manage your subscriptions to this and other Galaxy lists,
please use the interface at:

   http://lists.bx.psu.edu/


--
Jennifer Jackson
http://galaxyproject.org