Hello Jianguang,
This general protocol is also in the RNA-seq tutorial:
http://main.g2.bx.psu.edu/u/jeremy/p/galaxy-rna-seq-analysis-exercise
--> Understanding and QCing the reads
That said, I had a sample of your data from before, so I ran FastQC on
it and I see what you mean: the quality drops off steadily after the
first 10 bases or so, then falls below a Phred score of 20 around the
middle of the sequence (for both ends).
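For reference, the per-position summary that FastQC plots can be
approximated in a few lines of Python. This is only a rough sketch, not
what FastQC itself runs: the function name mean_quality_per_position is
made up for illustration, and the default offset of 33 assumes
Sanger / Illumina 1.8+ quality encoding (older Illumina files use 64).

```python
def mean_quality_per_position(quality_strings, offset=33):
    """Return the mean Phred score at each read position.

    quality_strings: iterable of FASTQ quality lines (4th line of each record).
    offset: ASCII offset of the encoding (assumed 33 here; 64 for old Illumina).
    """
    totals = []  # running sum of scores per position
    counts = []  # number of reads that reach each position
    for qual in quality_strings:
        for i, ch in enumerate(qual):
            if i == len(totals):
                # First read long enough to reach this position.
                totals.append(0)
                counts.append(0)
            totals[i] += ord(ch) - offset
            counts[i] += 1
    return [t / c for t, c in zip(totals, counts)]


if __name__ == "__main__":
    # Toy quality strings, not real data: 'I' is Q40 and '5' is Q20 at offset 33.
    print(mean_quality_per_position(["II5", "I5"]))
```

A position where the mean drops below your chosen cutoff is a candidate
trimming point, but with short reads it is worth checking how many bases
would be left before committing to it.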
There are a couple of options:
1 - Do as Ann suggests and just leave these alone and test to see what
happens in TopHat. If the mapping fails, then you will know that you
need to do some quality cleanup.
2 - Use the FastQC results to decide on a lower quality score boundary
and trim only the very worst bases. Because the reads are just 36 bp
long, yes, take care not to remove too much. As I stated, from the
sample I looked at, even a cutoff of Phred 20 would probably clip too
aggressively.
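To make option 2 concrete, here is a minimal sketch of conservative
3'-end trimming with a floor on read length. It is not the code behind
any Galaxy tool: the function name trim_3prime and the default values
(min_q=10, min_len=30, offset=33) are assumptions chosen for
illustration, and the right numbers for your data will need testing.

```python
def trim_3prime(seq, qual, min_q=10, min_len=30, offset=33):
    """Remove low-quality bases from the 3' end of a single read.

    Trimming stops at the first base (scanning from the 3' end) whose
    Phred score is >= min_q, or when the read would become shorter than
    min_len, whichever comes first. All defaults are illustrative.
    """
    end = len(seq)
    while end > min_len and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


if __name__ == "__main__":
    # Toy read, not real data: 'I' is Q40 and '#' is Q2 at offset 33.
    print(trim_3prime("ACGTACGT", "IIIII###", min_q=10, min_len=4))
    # The length floor stops trimming even though a low-quality base remains.
    print(trim_3prime("ACGTACGT", "IIIII###", min_q=10, min_len=6))
```

The length floor is the important part for 36 bp reads: it keeps a
quality cutoff from shortening reads to the point where they no longer
map uniquely.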
In general it is best to do as little manipulation as possible with
expression data. Some testing on your part will be needed to identify
the correct processing, and the same process will not apply to all
datasets. But the general path outlined in the tutorial is a good one
for what you are trying to do and should be able to address your questions.
Take care,
Jen
Galaxy team
On 8/23/12 7:40 AM, Du, Jianguang wrote:
Dear All,
I am analysing RNA-seq datasets for the differential splicing events
between cell types. My reads are 36bp long. In order to increase the
quality of reads, I need to trim some nucleotides from ends. How many
nucleotides can I trim? I am afraid that if I trim too much, the
reliability of the alignment will be affected.
Thanks in advance.
Jianguang
___________________________________________________________
The Galaxy User list should be used for the discussion of
Galaxy analysis and other features on the public server
at usegalaxy.org. Please keep all replies on the list by
using "reply all" in your mail client. For discussion of
local Galaxy instances and the Galaxy source code, please
use the Galaxy Development list:
http://lists.bx.psu.edu/listinfo/galaxy-dev
To manage your subscriptions to this and other Galaxy lists,
please use the interface at:
http://lists.bx.psu.edu/
--
Jennifer Jackson
http://galaxyproject.org