It sounded to me like he simply wanted the distribution of last bases,
which happen to be all T in your example. We can extract and tabulate
them more directly like this:
# example of trimLRPatterns value:
> A = DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))
> n = nchar(A)
> last = narrow(A, start=n, end=n)
> alphabetFrequency(last, baseOnly=TRUE, collapse=TRUE)
    A     C     G     T other
    0     0     0     3     0
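The same steps can be wrapped in a small function for reuse on a real
trimmed set. This is only a sketch: last_base_composition is a
hypothetical helper name, not part of Biostrings, and the filtering
step assumes that some reads may come back with zero width after
trimming, in which case they have no last base to extract.

```r
library(Biostrings)

# Hypothetical helper, not part of Biostrings: tabulate the final base
# of each read in a DNAStringSet of mixed widths.
last_base_composition <- function(reads) {
  # Drop zero-width reads; they have no last base to extract.
  reads <- reads[width(reads) > 0]
  n <- width(reads)
  # One-base window at each read's own last position.
  last <- narrow(reads, start = n, end = n)
  alphabetFrequency(last, baseOnly = TRUE, collapse = TRUE)
}

# Same example as above, plus one empty read to exercise the filter.
A <- DNAStringSet(c("ACAT", "CAT", "AGGGCGT", ""))
last_base_composition(A)
```

For the three non-empty reads this should again count three T's.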
On Aug 26, 2010, at 9:29 AM, Joern Toedling wrote:
Hi,
Have a look at the "shift" argument of the function consensusMatrix
from Biostrings.
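This code example should correspond to your question. Three nucleotide
strings are aligned at their last position and the sequence
composition at each aligned position is obtained; the last column of
the resulting matrix corresponds to the final base of each string: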
A <- DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))
maxlen <- max(nchar(A))
consensusMatrix(A, shift=maxlen-nchar(A), baseOnly=TRUE)
I tested this with Biostrings_2.17.29, but I guess that it works with
the current release version, too.
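To read off only the final cycle from that matrix, one can take its
last column. A minimal sketch, assuming that consensusMatrix sizes the
matrix to the longest shifted string so that the last column is the
shared end position; cm and last_cycle are just illustrative names:

```r
library(Biostrings)

A <- DNAStringSet(c("ACAT", "CAT", "AGGGCGT"))
maxlen <- max(width(A))

# Right-align the strings: shorter ones are shifted so all ends coincide.
cm <- consensusMatrix(A, shift = maxlen - width(A), baseOnly = TRUE)

# Assumption: the last column is the position where every string ends.
last_cycle <- cm[, ncol(cm)]
last_cycle
```

Indexing with ncol(cm) - 1 instead would give the second-to-last
position, counted only over the strings long enough to reach it.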
Regards,
Joern
On Thu, 26 Aug 2010 07:35:01 -0500, joseph franklin wrote:
Hi,
I've been trimming adapters from reads using trimLRPatterns. The
resulting trimmed set contains a heterogeneous mix of widths, from
~18-35 nt. Can anyone guide me toward an elegant way to find the
nucleotide composition of the final (right-most) cycle for each of
the trimmed reads?
Many thanks,
Joe Franklin
---
Joern Toedling
Institut Curie -- U900
26 rue d'Ulm, 75005 Paris, FRANCE
Tel. +33 (0)156246927
_______________________________________________
Bioc-sig-sequencing mailing list
[email protected]
https://stat.ethz.ch/mailman/listinfo/bioc-sig-sequencing