Hello,
I was looking at the first few sequences in the object returned by readAligned,
e.g.
> head(sread(aligned))
A DNAStringSet instance of length 6
width seq
[1] 36 AACCCTAACCCTAACCCTAACCTTAACCTAACCTTA
[2] 36 TCCGCCTTCAGAGTACCACCGAAATCTGTGCAGAGG
[3] 36 GCCTCTCTGCGCCTGCGCCGGCGGCGTTTCGTTCTC
[4] 36 GCGCGGCGCGCCTCTCGGCGCCTGCGCCGGCGGAGG
[5] 36 GAGGAAAAAGGCAGGACAGAATTACGAGGTGCTGGC
[6] 36 GAAAAAGGCAGGACAGAATTACGAGATGCTGGCNCA
and I then looked at their strands:
> head(strand(aligned))
[1] + + - - + +
When I searched the .map file for this alignment, I was able to find the first
2 sequences (which are on the + strand), but not the 3rd, nor its complement.
The same goes for the 4th, which is also on the - strand. To get the complement
I used Biostrings::complementSeq.
Could this be a bug in the way that the readAligned object is created?
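
One thing that may be worth ruling out first: a complement keeps the base order, whereas a read stored in the opposite orientation would appear as its reverse complement. Below is a minimal sketch in base R, assuming upper-case reads over A, C, G, T and N; rev_comp is a hypothetical helper name, not part of ShortRead or Biostrings.

```r
# Hypothetical helper: reverse the base order, then complement each base.
# Assumes upper-case reads containing only A, C, G, T and N.
rev_comp <- function(read) {
  bases <- rev(strsplit(read, "")[[1]])                  # reverse base order
  chartr("ACGTN", "TGCAN", paste(bases, collapse = ""))  # complement each base
}

read3 <- "GCCTCTCTGCGCCTGCGCCGGCGGCGTTTCGTTCTC"  # third read shown above

complement_only    <- chartr("ACGTN", "TGCAN", read3)  # same order, complemented
reverse_complement <- rev_comp(read3)

print(complement_only)
print(reverse_complement)
# reverse_complement is "GAGAACGAAACGCCGCCGGCGCAGGCGCAGAGAGGC"
```

Searching the .map file for the second string instead of the first would show whether the - strand reads are simply stored in the other orientation. With Biostrings loaded, reverseComplement(DNAString(read3)) should give the same sequence.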
I also noticed that the mismatch column for negative stranded reads is exactly
the same as in the .map file (when I found them by chr and position - 1, rather
than by sequence).
Should this be = (coordinate - 35) for negative reads, since Bowtie reports all
mismatches from the 5' end of the read and ShortRead coordinates are in terms
of sequencing cycles?
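
To make the arithmetic concrete, here is a small sketch of mirroring a mismatch offset within a read. It assumes 0-based offsets and 36-base reads, in which case the mirrored offset is width - 1 - offset, i.e. 35 - offset. flip_offset is a hypothetical helper, and whether either tool actually uses this convention is exactly what is being asked, not something the sketch settles.

```r
# Hypothetical helper: mirror 0-based offsets within a read of a given width,
# so an offset counted from one end becomes the offset from the other end.
flip_offset <- function(offset, width) {
  stopifnot(all(offset >= 0L), all(offset < width))  # offsets must lie in the read
  width - 1L - offset
}

# Offsets 0, 10 and 35 in a 36-base read map to 35, 25 and 0.
flipped <- flip_offset(c(0L, 10L, 35L), 36L)
print(flipped)
```

Applying the function twice returns the original offsets, which is a quick sanity check on any conversion of this kind.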
--------------------------------------
Dario Strbenac
Research Assistant
Cancer Epigenetics
Garvan Institute of Medical Research
Darlinghurst NSW 2010
Australia
_______________________________________________
Bioc-sig-sequencing mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing