Thank you, Peter! I think I should put more detailed information here.

I'm working with piRNA data. The library contains two groups of piRNAs, named sense and antisense. As I said, they are complementary to each other over about 10 nt, while the full length is about 30 nt. The sequences in the sense group share the feature of having an "A" at the 10th position.
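To make the "A at the 10th position" feature concrete, here is a minimal sketch of how reads could be split on that criterion. The function name `split_by_tenth_base` is hypothetical, and the sketch assumes a plain four-line-per-record FASTQ file. Note that a read from the antisense group can also carry an "A" at position 10 by chance, so this only yields candidates, not a definitive classification.

```python
def split_by_tenth_base(fastq_lines, base="A", position=10):
    """Split FASTQ records into (matches, others) by the base at `position` (1-based).

    Assumption: every record is exactly four lines (header, sequence, '+', quality).
    """
    lines = [ln.rstrip("\n") for ln in fastq_lines]
    matches, others = [], []
    for i in range(0, len(lines) - 3, 4):
        header, seq, _plus, qual = lines[i:i + 4]
        record = (header, seq, qual)
        # Reads shorter than `position` cannot be tested and go to `others`.
        if len(seq) >= position and seq[position - 1].upper() == base:
            matches.append(record)
        else:
            others.append(record)
    return matches, others


# Tiny in-memory example: r1 has "A" at position 10, r2 does not.
demo = [
    "@r1", "TTTTTTTTTATTTTT", "+", "IIIIIIIIIIIIIII",
    "@r2", "TTTTTTTTTTTTTTT", "+", "IIIIIIIIIIIIIII",
]
candidates, rest = split_by_tenth_base(demo)
```

With a real file, the same function could be called as `split_by_tenth_base(open("reads.fastq"))`, where `reads.fastq` is a placeholder name.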

In this case, how can I deal with it? One approach that comes to mind is inverting all the sequences and aligning them.
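If "inverting" here means taking the reverse complement, the idea can be sketched roughly as follows. This is only an illustration: the helper names are hypothetical, and it assumes the roughly 10 nt complementary stretch sits at the 5' end of both reads, which I have not confirmed for this library.

```python
# Translation table for complementing DNA or RNA bases (U is treated like T).
COMPLEMENT = str.maketrans("ACGTUNacgtun", "TGCAANtgcaan")


def reverse_complement(seq):
    """Return the reverse complement of a sequence, written as DNA."""
    return seq.translate(COMPLEMENT)[::-1]


def overlaps_at_five_prime(read_a, read_b, overlap=10):
    """True if the first `overlap` bases of the two reads are complementary.

    Assumption: the complementary region is at the 5' end of both reads.
    """
    if len(read_a) < overlap or len(read_b) < overlap:
        return False
    return reverse_complement(read_a[:overlap]).upper() == read_b[:overlap].upper()


# Toy pair: read_a has "A" at position 10, so its partner starts with "T".
read_a = "ACGTACGTTACCCCC"
read_b = "TAACGTACGTGGGGG"
paired = overlaps_at_five_prime(read_a, read_b)
```

Comparing every read against every other read this way gets slow for a full library, so in practice one would probably index the first 10 bases in a dictionary first.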


Quoting "Peter Cock":

On Mon, Nov 26, 2012 at 6:47 PM, Zhiqiang Shu wrote:
Hi, Galaxy users!

I have a question on how to find out sense and antisense sequence. I've got
RNA seq data in the fastq format. The sequences inside are partially
complementary to each other (complementary is 10nt, while entire is about
30nt). How can I separate these sequences into two groups: sense and
antisense

Depending on how your sequences were prepared, you might be able to
look for a poly-A tail as a clue to orientation. Another approach is to
compare the (assembled) transcripts to known genes and if you only
get matches on one strand that is probably the correct orientation.

(one thing I know is for the sense sequence the 10th nucleotide is
always "A")?

Why is that? Is this related to your library preparation?


Zhiqiang Shu/Deng Lab
Department of Biological Science
Florida State University
319 Stadium Dr.
Tallahassee, FL, USA, 32306-4295
